NVIDIA / cccl

CUDA Core Compute Libraries
https://nvidia.github.io/cccl/
Other
1.21k stars 153 forks source link

Don't call the predicate twice in remove_if #702

Open mcarilli opened 8 years ago

mcarilli commented 8 years ago

Hello Thrust Team,

While playing around with learning Thrust, I happened to put a printf statement in the predicate-functor passed to thrust::remove_if(), and noticed some curious behavior. When run on host_vectors (ie on the CPU) the printf statement is output once per element, but when run on device vectors, the printf statement is output twice per element.

According to this, Thrust uses a parallel scan to determine output positions for the non-removed elements in a stable way, so it must do something like run the predicate on each element, then create an array of "hit" tags corresponding to each element with hit[i] = 1 if predicate( array[i] ) == false (don't remove) and hit[i] = 0 if predicate( array[i] ) == true (do remove). An exclusive scan on hit then gives the output positions for filtered elements of array...I'm just saying all this to highlight that I can't for the life of me imagine why the predicate functor would need to be called twice per element. What am I missing??

I have attached a minimal working sample, identical to the remove_points2d.cu example except that it has a printf statement in the predicate functor and includes device_vector.h in addition to host_vector.h. If x and y are host_vectors, "In here" gets printed 20 times, but if x and y are device_vectors, it gets printed 40 times.

remove_points_devicevshost.cu.txt

Thanks in advance for your insight, Michael

3gx commented 8 years ago

This is Thrust implementation detail of how reduce_if is implemented: the first time predicate is called for scan operation, and the second time to check whether the element needs to be copied (remove_if is a wrapper call to copy_if with a negated predicate). This implementation detail may change in future and one should not rely that reduce_if is called twice.

Hope this helps.

mcarilli commented 8 years ago

I don't think you need to call the functor the second time. I think you could do something like the following: In the hit array record a 1 for each element you want to keep, as I described above. Then run the scan on hit. On the second pass over the data elements, instead of rerunning the functor on element i, you can just compare hits_scanned[i] with hits_scanned[i+1]. If hits_scanned[i+1] = hits_scanned[i]+1, it means the exclusive scan recorded a hit at location i, and the element should be copied. There's nothing stopping you from doing the scan in-place so hits_scanned could be the same memory as hit but I renamed it to hopefully make my point clearer.

Thank you for the information in any case. I was mostly seeking reassurance that I hadn't somehow screwed up my usage of the thrust functions.

3gx commented 8 years ago

Agree, there is no need having two predicate invocations, and as I said, this is something that may change in future.

brycelelbach commented 7 years ago

Yah, this is and remains a problem in many places. Note that, by design, the C++17 parallel algorithms relax the complexity guarantees of the sequential algorithms, to allow flexibility and redundant computations. However, there are many places in Thrust where we go overboard with that, and assume (either by design or by mistake) that all user-defined types are TriviallyCopyable, all Callables are pure and that they can be called on invalid data.

brycelelbach commented 6 years ago

Tracked by NVBug 2062288.

mcarilli commented 6 years ago

Ha, i’m actually at Nvidia now. I joined Frameworks two and a half months ago. Come to think of it, i sit pretty close to @egaburov. I’ll say hi tomorrow.

brycelelbach commented 6 years ago

Ha! Small world. I'm in endeavor, 2nd floor, 3G044. I should be around most of the day, except around lunch.


Bryce Adelstein Lelbach aka wash ISO C++ Committee Member CppCon and C++Now Program Chair Thrust Maintainer, HPX Developer CUDA Convert and Reformed AVX Junkie

Sleep is for the weak


From: mcarilli [notifications@github.com] Sent: Tuesday, February 13, 2018 9:34 PM To: thrust/thrust Cc: Bryce Lelbach; Assign Subject: Re: [thrust/thrust] Don't call the predicate twice in remove_if (#836)

Ha, i’m actually at Nvidia now. I joined Frameworks two and a half months ago. Come to think of it, i sit pretty close to @egaburovhttps://github.com/egaburov. I’ll say hi tomorrow.

— You are receiving this because you were assigned. Reply to this email directly, view it on GitHubhttps://github.com/thrust/thrust/issues/836#issuecomment-365501294, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AAYTclRMq6YYOeQHhrdt03SpPCvnxOg8ks5tUnBigaJpZM4J-eQO.


This email message is for the sole use of the intended recipient(s) and may contain confidential information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message.