iterative / datachain

AI-data warehouse to enrich, transform and analyze data from cloud storages
https://docs.datachain.ai
Apache License 2.0

Support Inf in `Array(Float)` #386

Open dberenbaum opened 2 months ago

dberenbaum commented 2 months ago

`Array(Float)` columns are serialized as JSON using the orjson library, which does not support Inf values (see this issue). orjson writes both NaN and Inf as null, so an Inf value comes back as nan when the column is read:

>>> from datachain import DataChain
>>> DataChain.from_values(val=[[0.0, float("nan"), float("inf")]]).show()
Processed: 1 rows [00:00, 1862.48 rows/s]
Generated: 1 rows [00:00, 1987.82 rows/s]
Cleanup: 1 tables [00:00, 4346.43 tables/s]
               val
0  [0.0, nan, nan]
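
One possible direction, sketched here with the standard-library `json` module rather than orjson so it runs without extra dependencies: replace non-finite floats with string sentinels before serialization and map them back after loading. The helper names `encode_floats` and `decode_floats` and the sentinel strings are hypothetical and are not part of DataChain or orjson; this is only an illustration of the round trip, not a proposed implementation.

```python
import json
import math

# Hypothetical sentinel strings; not part of DataChain or orjson.
_SENTINELS = {"Infinity": math.inf, "-Infinity": -math.inf, "NaN": math.nan}


def encode_floats(values):
    """Replace non-finite floats with string sentinels that strict JSON can carry."""
    out = []
    for v in values:
        if math.isnan(v):
            out.append("NaN")
        elif math.isinf(v):
            out.append("Infinity" if v > 0 else "-Infinity")
        else:
            out.append(v)
    return out


def decode_floats(values):
    """Map sentinel strings back to floats; leave ordinary numbers untouched."""
    return [_SENTINELS[v] if isinstance(v, str) else v for v in values]


original = [0.0, float("nan"), float("inf"), float("-inf")]

# allow_nan=False makes the stdlib encoder reject non-finite floats,
# so this call only succeeds because the values were encoded first.
payload = json.dumps(encode_floats(original), allow_nan=False)
restored = decode_floats(json.loads(payload))

print(payload)
print(restored)
```

With this approach NaN and Inf stay distinguishable after the round trip, at the cost of a mixed number/string array in the stored JSON, so every reader of the column would need the matching decode step.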