pola-rs / nodejs-polars

nodejs front-end of polars
https://pola-rs.github.io/nodejs-polars/
MIT License

`readRecords` should not discard `null` values #256

Open ad-si opened 1 month ago

ad-si commented 1 month ago

Have you tried latest version of polars?

What version of polars are you using?

0.15

What operating system are you using polars on?

macOS Sonoma 14.6.1 arm64

What node version are you using?

bun 1.1.26

Describe your bug.

`readRecords` drops any column whose values are all `null`.

What are the steps to reproduce the behavior?

import { DataFrame, DataType, pl } from "nodejs-polars"

console.log(
  pl.readRecords([
    { name: "John", color: "green", height: null },
    { name: "Anna", color: "green", height: null },
  ]),
)

What is the actual behavior?

shape: (2, 2)
┌──────┬───────┐
│ name ┆ color │
│ ---  ┆ ---   │
│ str  ┆ str   │
╞══════╪═══════╡
│ John ┆ green │
│ Anna ┆ green │
└──────┴───────┘

What is the expected behavior?


shape: (2, 3)
┌──────┬───────┬────────┐
│ name ┆ color ┆ height │
│ ---  ┆ ---   ┆ ---    │
│ str  ┆ str   ┆ str    │
╞══════╪═══════╪════════╡
│ John ┆ green ┆ null   │
│ Anna ┆ green ┆ null   │
└──────┴───────┴────────┘

ad-si commented 1 month ago

Also: `readJSON` does the right thing with the same data.

console.log(
  pl.readJSON(JSON.stringify([
    { name: "John", color: "green", height: null },
    { name: "Anna", color: "green", height: null },
  ])),
)

Bidek56 commented 1 month ago

Please pass a schema to get your expected results. Thx

const rows = [
  { name: "John", color: "green", height: null },
  { name: "Anna", color: "green", height: null },
];

const schema = { name: pl.String, color: pl.String, height: pl.String };

console.log(pl.readRecords(rows, { schema }));
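
For wider records, writing the schema by hand gets tedious. The following is a minimal sketch, not part of nodejs-polars, of deriving a schema object from the records so that all-null columns are still listed. `buildSchema` and `labelFor` are hypothetical helper names, and the string labels `"String"` and `"Float64"` are stand-ins so the snippet runs without the library. In real use the callback would presumably return dtype values such as `pl.String`.

```typescript
type Row = Record<string, unknown>;

// Collect every key seen in any record and ask `inferType` for a dtype,
// so that columns containing only null are still present in the schema.
function buildSchema<T>(
  records: Row[],
  inferType: (values: unknown[]) => T,
): Record<string, T> {
  const columns: Record<string, unknown[]> = {};
  for (const record of records) {
    for (const key of Object.keys(record)) {
      if (!(key in columns)) columns[key] = [];
      columns[key].push(record[key]);
    }
  }
  const schema: Record<string, T> = {};
  for (const key of Object.keys(columns)) {
    schema[key] = inferType(columns[key]);
  }
  return schema;
}

// Hypothetical inference rule: look at the first non-null value and
// fall back to a string column when there is none (as in the workaround above).
const labelFor = (values: unknown[]): string => {
  const sample = values.find((v) => v !== null && v !== undefined);
  if (sample === undefined) return "String"; // all-null column
  return typeof sample === "number" ? "Float64" : "String";
};

const rows: Row[] = [
  { name: "John", color: "green", height: null },
  { name: "Anna", color: "green", height: null },
];

// All three columns are listed, including the all-null `height`.
console.log(buildSchema(rows, labelFor));
```

The resulting object has the same shape as the hand-written `schema` above, so it could be passed in the same way once the labels are replaced with real dtypes.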

Bidek56 commented 14 hours ago

@ad-si Does the above solution work for you? Can this be closed? Thx

ad-si commented 14 hours ago

Your solution is a workaround and not a fix. So the issue itself still stands.

ad-si commented 14 hours ago

One correction to my expected behavior table: the `height` column should actually have the type `Unknown`, not `str`. See https://docs.pola.rs/api/python/stable/reference/api/polars.datatypes.Unknown.html#polars.datatypes.Unknown
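
To see which columns are affected in a given dataset, here is a minimal library-free sketch that lists the columns with no non-null value, i.e. the ones for which no type can be inferred from the data. `allNullColumns` is a hypothetical helper name, not a nodejs-polars function.

```typescript
type Row = Record<string, unknown>;

// Return the names of columns in which no record holds a non-null value.
function allNullColumns(records: Row[]): string[] {
  // column name -> whether at least one non-null value was seen
  const hasValue: Record<string, boolean> = {};
  for (const record of records) {
    for (const key of Object.keys(record)) {
      const present = record[key] !== null && record[key] !== undefined;
      hasValue[key] = (hasValue[key] ?? false) || present;
    }
  }
  return Object.keys(hasValue).filter((key) => !hasValue[key]);
}

const records: Row[] = [
  { name: "John", color: "green", height: null },
  { name: "Anna", color: "green", height: null },
];

// For the data in this issue, `height` is the only all-null column.
console.log(allNullColumns(records));
```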