IBM / data-prep-kit

Open source project for data preparation for LLM application builders
https://ibm.github.io/data-prep-kit/
Apache License 2.0
fix uint64 hash to pyarrow #793

Closed by dolfim-ibm 1 week ago

dolfim-ibm commented 1 week ago

Why are these changes needed?

It seems pyarrow needs a bit of help when building a table from a uint64 value, i.e. an integer too large to fit in int64.

This is a minimal example that fails:

import pyarrow as pa
pa.Table.from_pylist([{"binary_hash": 17915699055171962696}])

Related issue number (if any).

Fixes https://github.com/IBM/data-prep-kit/issues/767

sujee commented 1 week ago

well done chasing this one @dolfim-ibm :clap: