rapidsai / cudf

cuDF - GPU DataFrame Library
https://docs.rapids.ai/api/cudf/stable/
Apache License 2.0

[QST] OOM of cuDF (much smaller than GPU memory) #6126

Closed PerkzZheng closed 3 years ago

PerkzZheng commented 3 years ago

What is your question?

df = cudf.read_csv('/mypath/train-noIndex.csv')

I am just reading an 11.5 GB CSV file into GPU memory on a V100 (32 GB), with no other operations, yet it fails with OOM even though 11.5 GB is much smaller than 32 GB. Are there any RMM settings that might help, and why does this happen?

Error: MemoryError: std::bad_alloc: CUDA error at: ../include/rmm/mr/device/cuda_memory_resource.hpp:68: cudaErrorMemoryAllocation out of memory

Env: Docker: rapidsai/rapidsai:0.15-cuda11.0-runtime-ubuntu18.04-py3.8

jlowe commented 3 years ago

Does the CSV file contain a lot of fields that are encoded much more efficiently in the text file than they would be once loaded into GPU memory? For example, take the following CSV file snippet containing long and double values:

longs,doubles,morelongs
0,1,2
3,4,5
1,2,3
0,0,0
[...]

Each field in this input takes only 2 bytes to encode in the CSV file (one digit plus a delimiter), but each will be loaded into the GPU as a 64-bit value (INT64 or FLOAT64). That's a 4x memory increase, which means an 11.5 GB file like this would need much more than 32 GB of GPU memory to hold the result, ignoring any overhead of the parsing operation itself.
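To make the expansion concrete, here is a small stdlib-only sketch (not cuDF code) that counts the fields in the snippet above and estimates their decoded size, assuming every field becomes a 64-bit value:

```python
import csv
import io

def estimate_decoded_bytes(csv_text, bytes_per_value=8):
    """Estimate GPU-resident size of a parsed CSV, assuming each field
    decodes to a fixed-width 64-bit value (INT64 or FLOAT64)."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row; it does not become column data
    n_values = sum(len(row) for row in reader)
    return n_values * bytes_per_value

snippet = "longs,doubles,morelongs\n0,1,2\n3,4,5\n1,2,3\n0,0,0\n"

# Text size of just the data rows: 4 rows x "0,1,2\n" = 24 bytes.
text_bytes = len(snippet.encode()) - len("longs,doubles,morelongs\n")
decoded_bytes = estimate_decoded_bytes(snippet)  # 12 values x 8 = 96 bytes
print(decoded_bytes // text_bytes)  # 4x expansion for single-digit fields
```

The same ratio scales up: single-digit numeric fields roughly quadruple in size when decoded to 64-bit columns.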

Speaking of parsing overhead, I would expect the libcudf parser to need at least ~2x the memory during parsing, since it holds both the input text data and the parse-result output buffers simultaneously.
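Putting the two points together, a back-of-envelope peak-memory estimate for this case might look like the following. The 4x expansion factor is an assumption taken from the single-digit example above; the real factor depends on the data:

```python
GiB = 1024 ** 3

def peak_parse_bytes(input_bytes, expansion=4.0):
    """Rough peak memory while parsing: the raw input text plus the
    decoded output buffers are resident at the same time."""
    return input_bytes + input_bytes * expansion

# ~11.5 GiB of text plus a ~4x-larger decoded result: well over the
# 32 GiB available on a V100, consistent with the OOM reported above.
peak = peak_parse_bytes(11.5 * GiB)
print(peak / GiB)  # ~57.5 GiB
```

This is only an estimate of the dominant terms; it ignores allocator fragmentation and any intermediate buffers the parser uses internally.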

PerkzZheng commented 3 years ago

@jlowe Thanks for the reply! It really helped a lot; I will try reading them as text files.