There are small differences in duplicate rates between CPU and GPU runs. These mostly come from the `checkHitspLS` kernel, as discussed in #323. The differences come from running the duplicate removal in parallel: threads work concurrently and can end up removing elements at the same time, so the set of surviving elements can differ from what a sequential pass produces. Running the CPU code in parallel makes the difference smaller.
I'd say this is not really an issue, since we understand what's happening and the effect is fairly small, but I'll leave this open in case we forget and want to look back at the source of the difference.
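
For future reference, here is a toy sketch of how removing duplicates concurrently can change the surviving set. It is not the actual `checkHitspLS` logic: the function names, the `scores` list, and the "lower score loses" rule are all assumptions made up for illustration. The sequential version skips pairs where one element is already gone, while the "snapshot" version decides every pair against the initial state, roughly what happens when threads do not see each other's removals.

```python
# Toy model only: names, scores, and the "lower score loses" rule are hypothetical.

def remove_duplicates_sequential(scores, duplicate_pairs):
    """Process pairs one at a time; an already-removed element cannot remove another."""
    removed = set()
    for i, j in duplicate_pairs:
        if i in removed or j in removed:
            continue  # one side is already gone, so nothing to compare
        loser = i if scores[i] < scores[j] else j
        removed.add(loser)
    return sorted(set(range(len(scores))) - removed)


def remove_duplicates_snapshot(scores, duplicate_pairs):
    """Decide every pair against the initial state, as if all threads ran at once."""
    removed = set()
    for i, j in duplicate_pairs:
        loser = i if scores[i] < scores[j] else j
        removed.add(loser)
    return sorted(set(range(len(scores))) - removed)


# Element 0 duplicates 1, and 1 duplicates 2, but 0 and 2 are not duplicates.
scores = [3.0, 2.0, 1.0]
duplicate_pairs = [(0, 1), (1, 2)]

sequential_survivors = remove_duplicates_sequential(scores, duplicate_pairs)
snapshot_survivors = remove_duplicates_snapshot(scores, duplicate_pairs)

print("sequential:", sequential_survivors)  # [0, 2]
print("snapshot:  ", snapshot_survivors)    # [0]
```

In the sequential pass element 1 is removed first and therefore never removes element 2. In the snapshot pass both removals happen, so one extra element is dropped. A real GPU run would sit somewhere between these two extremes depending on thread scheduling.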