pola-rs / polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust
https://docs.pola.rs

Serializing float columns with `format="json"` turns inf/nan values into null #17211

Open stinodego opened 3 months ago

stinodego commented 3 months ago

Reproducible example

import io

import polars as pl

df = pl.DataFrame({"a": [1.0, float("inf"), float("-inf"), float("nan")]})
ser = df.serialize(format="json")  # request JSON explicitly, as in the issue title
print(ser)  # {"columns":[{"name":"a","datatype":"Float64","bit_settings":"","values":[1.0,null,null,null]}]}
result = pl.DataFrame.deserialize(io.StringIO(ser), format="json")
print(result)

Log output

shape: (4, 1)
┌──────┐
│ a    │
│ ---  │
│ f64  │
╞══════╡
│ 1.0  │
│ null │
│ null │
│ null │
└──────┘

Issue description

Round-tripping a float column through JSON serialization does not preserve inf, -inf, or NaN values: they are written as null and read back as null.

Expected behavior

The original values are preserved: inf, -inf, and NaN should survive a serialize/deserialize round trip.

Installed versions

main

ritchie46 commented 3 months ago

Ah yes. That's because JSON doesn't support those values.
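
To illustrate the point about JSON: its grammar has no literal for NaN or infinity, so a strict encoder has to either reject these floats or replace them with something else. The sketch below uses only Python's standard `json` module, not Polars' serializer. The helpers `encode_float` and `decode_float` are hypothetical names, and tagging non-finite values as strings is just one possible workaround, assumed here for illustration; it is not a description of how Polars behaves or plans to behave.

```python
import json
import math

values = [1.0, float("inf"), float("-inf"), float("nan")]

# With allow_nan=False the standard encoder sticks to strict JSON and
# refuses non-finite floats instead of emitting the non-standard NaN/Infinity tokens.
try:
    json.dumps(values, allow_nan=False)
    strict_ok = True
except ValueError:
    strict_ok = False


def encode_float(x):
    """Hypothetical helper: map non-finite floats to tagged strings."""
    if math.isnan(x):
        return "NaN"
    if math.isinf(x):
        return "Infinity" if x > 0 else "-Infinity"
    return x


def decode_float(v):
    """Hypothetical helper: inverse of encode_float; None stays None."""
    if v is None:
        return None
    # float() parses "NaN", "Infinity" and "-Infinity" as well as plain numbers.
    return float(v)


# The encoded payload is valid strict JSON and round-trips the special values.
payload = json.dumps([encode_float(x) for x in values], allow_nan=False)
restored = [decode_float(v) for v in json.loads(payload)]

print(strict_ok)
print(payload)
print(restored)
```

The trade-off with this approach is that a reader of the payload has to know the convention: a consumer that is unaware of it would see strings in a float column.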