rapidsai / cudf

cuDF - GPU DataFrame Library
https://docs.rapids.ai/api/cudf/stable/
Apache License 2.0

Use grid stride in CSV reader kernels #14066

Open vuule opened 1 year ago

vuule commented 1 year ago

Currently, the CSV reader kernels parse data using one thread per row, so the number of threads (and therefore blocks) launched grows with the file size. Using a grid-stride loop would allow the kernels to launch with a preset number of blocks even for large inputs, with each thread processing multiple rows.

This applies both to the parser and the data inference kernels.
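For illustration, here is a minimal sketch of the grid-stride pattern applied to per-row work. It is not taken from the cuDF CSV reader: the kernel `row_lengths_kernel`, the host helper `row_lengths`, and the `max_blocks` / `block_size` parameters are hypothetical names, and the per-row work (computing each row's byte length from its offsets) is only a stand-in for parsing or type inference. Error checking on the CUDA calls is omitted for brevity.

```cuda
#include <cuda_runtime.h>

#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical per-row kernel. row_offsets holds num_rows + 1 entries;
// row i spans [row_offsets[i], row_offsets[i + 1]).
__global__ void row_lengths_kernel(const int* row_offsets, int* lengths, std::size_t num_rows)
{
  // Grid-stride loop: each thread starts at its global index and advances by
  // the total number of threads in the grid, so any grid size covers all rows.
  const std::size_t stride = static_cast<std::size_t>(blockDim.x) * gridDim.x;
  for (std::size_t row = static_cast<std::size_t>(blockIdx.x) * blockDim.x + threadIdx.x;
       row < num_rows;
       row += stride) {
    lengths[row] = row_offsets[row + 1] - row_offsets[row];
  }
}

// Hypothetical host helper: launches at most max_blocks blocks, regardless of
// how many rows the input contains.
std::vector<int> row_lengths(const std::vector<int>& row_offsets,
                             int max_blocks = 4,
                             int block_size = 128)
{
  if (row_offsets.size() < 2) { return {}; }
  const std::size_t num_rows = row_offsets.size() - 1;

  int* d_offsets = nullptr;
  int* d_lengths = nullptr;
  cudaMalloc(&d_offsets, row_offsets.size() * sizeof(int));
  cudaMalloc(&d_lengths, num_rows * sizeof(int));
  cudaMemcpy(d_offsets, row_offsets.data(), row_offsets.size() * sizeof(int),
             cudaMemcpyHostToDevice);

  // Blocks needed for one thread per row, capped at the preset maximum.
  const int needed = static_cast<int>((num_rows + block_size - 1) / block_size);
  const int blocks = std::min(needed, max_blocks);
  row_lengths_kernel<<<blocks, block_size>>>(d_offsets, d_lengths, num_rows);

  // The blocking copy on the default stream waits for the kernel to finish.
  std::vector<int> lengths(num_rows);
  cudaMemcpy(lengths.data(), d_lengths, num_rows * sizeof(int), cudaMemcpyDeviceToHost);

  cudaFree(d_offsets);
  cudaFree(d_lengths);
  return lengths;
}
```

With the default arguments the launch uses at most 4 × 128 = 512 threads, so an input of 10,000 rows is still fully processed because each thread loops over roughly 20 rows. In practice the block cap would likely be derived from the device (for example, a multiple of the SM count) rather than hard-coded.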

ANIKET0956 commented 11 months ago

Hi @vuule, if this issue is still open, can I give it a try and post updates?