NVIDIA / spark-rapids-jni

RAPIDS Accelerator JNI For Apache Spark
Apache License 2.0

[FEA] large string support #2190

Open jlowe opened 3 months ago

jlowe commented 3 months ago

libcudf now supports string columns with more than 2GB of character data. Many places in the C++ and Java code assume that string offsets are 4 bytes (32-bit), and large string columns violate that assumption, so we disabled libcudf's large string support (see #2189). We need to find and update all of those places before enabling large string support.
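
To illustrate why the 4-byte assumption breaks, here is a minimal Java sketch. It is not code from this repository or from cudf: the class `OffsetWidthSketch` and its methods are hypothetical names used only for illustration. It builds an offsets array from per-row byte lengths and checks whether the final offset still fits in a signed 32-bit integer.

```java
public class OffsetWidthSketch {
    // Hypothetical helper: builds 64-bit offsets from per-row byte lengths.
    // offsets[i] is the starting byte of row i; the last entry is the total byte count.
    public static long[] buildOffsets64(long[] rowByteLengths) {
        long[] offsets = new long[rowByteLengths.length + 1];
        for (int i = 0; i < rowByteLengths.length; i++) {
            // Math.addExact throws ArithmeticException if the long sum itself overflows.
            offsets[i + 1] = Math.addExact(offsets[i], rowByteLengths[i]);
        }
        return offsets;
    }

    // Hypothetical check: true only if every offset is representable as a signed 32-bit int.
    // Offsets are non-decreasing, so checking the last one is sufficient.
    public static boolean fitsIn32BitOffsets(long[] rowByteLengths) {
        long[] offsets = buildOffsets64(rowByteLengths);
        return offsets[offsets.length - 1] <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long gib = 1L << 30;
        long[] small = {gib, gib - 1}; // total is 2^31 - 1 bytes, the 32-bit limit
        long[] large = {gib, gib, 1};  // total is 2^31 + 1 bytes, just past the limit

        System.out.println("small fits in 32-bit offsets: " + fitsIn32BitOffsets(small));
        System.out.println("large fits in 32-bit offsets: " + fitsIn32BitOffsets(large));

        // Code that narrows a large offset to int silently wraps to a negative value.
        long lastOffset = buildOffsets64(large)[large.length];
        System.out.println("last offset narrowed to int: " + (int) lastOffset);
    }
}
```

Any code path that reads offsets as `int`, or sizes a buffer as `(rows + 1) * 4` bytes, makes the same assumption as the narrowing cast above and would need auditing.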

jlowe commented 2 months ago

rapidsai/cudf#16215 tracks the necessary changes in the cudf submodule.