LLNL / zfp

Compressed numerical arrays that support high-speed random access
http://zfp.llnl.gov
BSD 3-Clause "New" or "Revised" License

Running issues with zfp and cuda #218

Closed: qwj118 closed this issue 10 months ago

qwj118 commented 10 months ago

Hello, I am a beginner trying to combine zfp and CUDA. I have run into some problems and would like to ask for advice. I built zfp with these commands:

cd zfp
mkdir build
cd build
cmake .. -DZFP_WITH_CUDA=ON -DBUILD_TESTING=ON -DZFP_WITH_OPENMP=OFF
make
ctest

I want to run compression on the host and decompression on the device. I learned from the official documentation that currently only fixed-rate mode can be used with CUDA. However, I get error messages when compiling my own code:

error: calling a __host__ function("zfp_field_1d") from a __global__ function("addKernel") is not allowed
error: identifier "zfp_field_1d" is undefined in device code
......

Every zfp function I call produces an error like these, but I don't know what I am doing wrong. Could you take a look, or point me to a correct example to learn from? Thank you!

lindstro commented 10 months ago

The zfp API currently exposes only host functions. To do what you want, you should allocate a buffer for the decompressed data using cudaMalloc and pass the resulting device pointer to zfp_field_1d on the host. When you later call zfp_decompress, zfp will determine if any data movement is needed. If the compressed data resides on the host, it will first be copied to the device before decompression occurs.

You can take a look at the zfp command-line utility (utils/zfp.c) for how this is done. However, it only uses host pointers, even if (de)compression is done on the GPU.

lindstro commented 10 months ago

@qwj118 Does this answer your question? If so, I'd like to go ahead and close this issue.

qwj118 commented 10 months ago

Okay, thank you for your answer. I will close this issue.