mitsuba-renderer / enoki

Enoki: structured vectorization and differentiation on modern processor architectures

Any fast way to copy a GPU array to CPU? #81

Closed andyyankai closed 4 years ago

andyyankai commented 4 years ago

Is there a good method to copy a FloatC into a CPU array? I am currently using PyTorch's .cpu(), but it can be very slow if the graph is too complex. Is there a better way, either in C++ or Python?

Speierers commented 4 years ago

I believe you can get a pointer accessible from the CPU with the following code:

auto my_cpu_data_ptr = my_gpu_data.managed().data();

From this pointer you should be able to create your CPU array. Does this answer your question?
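For illustration, here is a rough sketch of that approach (untested; it assumes FloatC is a CUDA-backed enoki array such as enoki::CUDAArray<float>, and calls cuda_eval() to flush any pending computation first):

#include <enoki/cuda.h>
#include <vector>

using FloatC = enoki::CUDAArray<float>;

int main() {
    FloatC my_gpu_data = enoki::arange<FloatC>(1000000);
    enoki::cuda_eval();                        // evaluate any queued GPU computation

    auto managed = my_gpu_data.managed();      // migrate the storage to CUDA managed memory
    const float *ptr = managed.data();         // pointer that the CPU can dereference

    std::vector<float> cpu_copy(ptr, ptr + enoki::slices(managed));  // plain CPU-side copy
    return 0;
}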

andyyankai commented 4 years ago

I don't just want the pointer; I want to access the values of the GPU array. For example, I want to read a FloatD and save its value and its gradient into an .exr file. However, since I used Monte Carlo sampling on that FloatD, the graph of that variable is really complex, so I am wondering whether enoki has a solution for this. Just to be clear, if I have

FloatD temp = zero<FloatD>(1000000)
pdf = 1/1000000
for i in range(0, 1000000):
  temp += function(temp) * pdf

after this I can try

image = temp.torch().cpu()  # really slow
loadtoOpenEXR(image)

so I am curious whether mitsuba2 has this problem, or whether there is a way to directly save an enoki GPU array to OpenEXR.

Speierers commented 4 years ago

What we often do is call temp.numpy() and then reshape to get a 2D image from an enoki array in Python. This might be faster than torch().
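For example, something along these lines (untested sketch, reusing the temp array from above; the resolution res and the use of mitsuba's Bitmap class to write the .exr file are assumptions on my end):

import numpy as np
import mitsuba
mitsuba.set_variant('gpu_autodiff_rgb')    # any GPU variant; required before importing from mitsuba.core
from mitsuba.core import Bitmap

res = 1000                                 # hypothetical resolution (1000 x 1000 = 1000000 entries)
data = temp.numpy().reshape(res, res)      # copy the enoki GPU array to the CPU and make it 2D

Bitmap(data.astype(np.float32)).write('image.exr')   # save as OpenEXR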

merlinND commented 4 years ago

Also, I don't think there is any reason for the size of the graph associated with your FloatD to influence the time it takes to copy it to the CPU (at least when using an Enoki-only method).