[FEA] Explore performance impact of enabling relocatable device code in libcudf

rapidsai / cudf

cuDF - GPU DataFrame Library

https://docs.rapids.ai/api/cudf/stable/

Apache License 2.0


[FEA] Explore performance impact of enabling relocatable device code in libcudf #1437

Open jrhemstad opened 5 years ago

jrhemstad commented 5 years ago

Is your feature request related to a problem? Please describe.

It would be nice to quantify the impact of enabling relocatable device code in libcudf.

Describe the solution you'd like

A thorough study of micro-benchmark performance should be conducted comparing performance of key libcudf/cuDF operations with and without relocatable device code enabled.

The obvious starting point would be the airspeed velocity (ASV) benchmarks that have been created (but not yet released).
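As a rough illustration of the comparison described above, here is a minimal sketch that takes per-benchmark timings from two builds (one without and one with relocatable device code) and reports the slowdown ratio. Everything in it is an assumption for illustration: the function `compare_builds`, the 5% threshold, the benchmark names, and the timing values are hypothetical placeholders, not measurements from libcudf or output from any benchmark tool.

```python
from statistics import median


def compare_builds(baseline, rdc, threshold=0.05):
    """Compare median timings of two builds, benchmark by benchmark.

    baseline, rdc: dicts mapping a benchmark name to a list of timings in seconds.
    threshold: relative slowdown above which a benchmark is flagged (hypothetical default).
    """
    report = {}
    # Only benchmarks present in both result sets can be compared.
    for name in sorted(baseline.keys() & rdc.keys()):
        base = median(baseline[name])
        cand = median(rdc[name])
        ratio = cand / base  # > 1.0 means the RDC build was slower
        report[name] = {
            "baseline_s": base,
            "rdc_s": cand,
            "ratio": ratio,
            "regressed": ratio > 1.0 + threshold,
        }
    return report


# Placeholder timings in seconds; NOT real measurements.
baseline_times = {
    "groupby_sum": [0.010, 0.011, 0.010],
    "inner_join": [0.020, 0.021, 0.020],
}
rdc_times = {
    "groupby_sum": [0.010, 0.010, 0.011],
    "inner_join": [0.024, 0.025, 0.024],
}

report = compare_builds(baseline_times, rdc_times)
for name, row in report.items():
    flag = "REGRESSED" if row["regressed"] else "ok"
    print(f"{name}: ratio={row['ratio']:.2f} ({flag})")
```

Using the median rather than the mean is one way to reduce sensitivity to outlier runs; a real study would likely also want more repetitions and some measure of run-to-run variance.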

harrism commented 5 years ago

We probably need to create a set of microbenchmarks in order to do this... :)

jrhemstad commented 5 years ago

> We probably need to create a set of microbenchmarks in order to do this... :)

Probably. I was hoping to at least start with the ASV benchmarks that Kevin has created.

vyasr commented 2 years ago

@jrhemstad @robertmaynard @harrism I've been doing some archaeology and I see that a number of discussions have taken place on the topic of relocatable device code (RDC), with and without link-time optimization (LTO). Did we ever successfully collect any benchmarks? Is this still something that we want to consider, or do we expect that RDC will always have an unacceptable performance cost for RAPIDS, even with LTO?

jrhemstad commented 2 years ago

> Did we ever successfully collect any benchmarks?

No.

> Is this still something that we want to consider,

Sure.

> do we expect that RDC will always have an unacceptable performance cost for RAPIDS, even with LTO?

Never know until we try and see!