Enabling USE_ASYNC_STREAM

NVlabs / NVBit


Enabling USE_ASYNC_STREAM #67

Closed mahmoodn closed 2 years ago

mahmoodn commented 2 years ago

Hi, I see #define USE_ASYNC_STREAM in channel.hpp. Under what circumstances is it beneficial to enable it?

ovilla commented 2 years ago

When USE_ASYNC_STREAM is defined, we use a GPU buffer allocated with cudaMalloc, and the CPU uses cudaMemcpyAsync to pull from that buffer when it is full.

When USE_ASYNC_STREAM is not defined, we use a GPU buffer allocated with cudaMallocManaged, and the CPU reads from it directly when it is full, relying on the UVM driver to move pages.

Depending on the system, the driver, and the workload, either approach could be better, but for most use cases USE_ASYNC_STREAM seems to be the best.

mahmoodn commented 2 years ago

Thanks. You mean "defined" is better for most of the workloads? I guess so... But the default is "undefined" in channel.hpp.

ovilla commented 2 years ago

This is what we have in version 1.5.3

Not sure if you are looking at an old file or a modified one.

mahmoodn commented 2 years ago

Yes, I am using 1.5.3. I thought that in order to define the macro I had to write #define USE_ASYNC_STREAM 1, but the default has no value. As I checked, even without a value, the #ifdef is true. Thanks for the clarification.
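For anyone who hits the same confusion, here is a minimal sketch of the preprocessor behavior described above. It is not code from channel.hpp: the helper async_stream_enabled() is a hypothetical name used only for illustration, and the macro is defined without a value to match the form discussed in this thread.

```cpp
// Defined with no value, matching the form discussed in this thread.
#define USE_ASYNC_STREAM

// Hypothetical helper (not part of NVBit): true when the macro is defined.
constexpr bool async_stream_enabled() {
#ifdef USE_ASYNC_STREAM
    return true;   // taken whenever the macro is defined, with or without a value
#else
    return false;  // taken only when the macro is not defined at all
#endif
}

// Compile-time check: an empty definition still selects the #ifdef branch.
static_assert(async_stream_enabled(), "USE_ASYNC_STREAM should be seen as defined");
```

Note that #ifdef only tests whether the name is defined, so the value is irrelevant. A value would matter only if the code used #if USE_ASYNC_STREAM, which needs the macro to expand to an expression and would fail to compile with an empty definition.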