IBM / data-prep-kit

Open source project for data preparation of LLM application builders
https://ibm.github.io/data-prep-kit/
Apache License 2.0
307 stars 134 forks

use str as document_hash #798

Closed dolfim-ibm closed 1 week ago

dolfim-ibm commented 1 week ago

Why are these changes needed?

pyarrow has issues with uint64 types, so we should switch document_hash to str.
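A minimal sketch of the idea, not the code in this PR. The description does not say how document_hash is computed or which pyarrow issue is meant, so the SHA-256 based helper and the int64 range check below are assumptions used only for illustration. It shows that an unsigned 64-bit value can fall outside the signed 64-bit range, while its decimal string form round-trips without loss.

```python
import hashlib

INT64_MAX = 2**63 - 1  # largest value a signed 64-bit integer can hold


def document_hash_uint64(text: str) -> int:
    """Hypothetical helper: derive an unsigned 64-bit hash from the
    first 8 bytes of a SHA-256 digest (assumption, not from the PR)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], byteorder="big", signed=False)


def document_hash_str(text: str) -> str:
    """Same hash, stored as a decimal string instead of uint64."""
    return str(document_hash_uint64(text))


def fits_in_int64(value: int) -> bool:
    """True if the value survives a conversion to signed 64-bit."""
    return 0 <= value <= INT64_MAX


if __name__ == "__main__":
    for doc in ["first document", "second document", "third document"]:
        as_int = document_hash_uint64(doc)
        as_str = document_hash_str(doc)
        # The string form always converts back to the same integer.
        print(doc, as_str, fits_in_int64(as_int), int(as_str) == as_int)
```

Storing the hash as a string trades a little space for a column type that every reader handles the same way.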

Related issue number (if any).

This should also address #794.