pola-rs / nodejs-polars

nodejs front-end of polars
https://pola-rs.github.io/nodejs-polars/
MIT License

`readRecords` should not discard `null` values #256

Closed ad-si closed 1 month ago

ad-si commented 2 months ago

Have you tried latest version of polars?

What version of polars are you using?

0.15

What operating system are you using polars on?

macOS Sonoma 14.6.1 arm64

What node version are you using?

bun 1.1.26

Describe your bug.

readRecords discards columns with all null values.

What are the steps to reproduce the behavior?

import { DataFrame, DataType, pl } from "nodejs-polars"

console.log(
  pl.readRecords([
    { name: "John", color: "green", height: null },
    { name: "Anna", color: "green", height: null },
  ]),
)

What is the actual behavior?

shape: (2, 2)
┌──────┬───────┐
│ name ┆ color │
│ ---  ┆ ---   │
│ str  ┆ str   │
╞══════╪═══════╡
│ John ┆ green │
│ Anna ┆ green │
└──────┴───────┘

What is the expected behavior?


shape: (2, 3)
┌──────┬───────┬────────┐
│ name ┆ color ┆ height │
│ ---  ┆ ---   ┆ ---    │
│ str  ┆ str   ┆ str    │
╞══════╪═══════╪════════╡
│ John ┆ green ┆ null   │
│ Anna ┆ green ┆ null   │
└──────┴───────┴────────┘

ad-si commented 2 months ago

Also: readJSON does the right thing.

console.log(
  pl.readJSON(JSON.stringify([
    { name: "John", color: "green", height: null },
    { name: "Anna", color: "green", height: null },
  ])),
)

Bidek56 commented 2 months ago

Please pass a schema to get your expected results. Thx

import pl from "nodejs-polars";

const rows = [
  { name: "John", color: "green", height: null },
  { name: "Anna", color: "green", height: null },
];

// Declare every column explicitly so the all-null `height` column is kept.
const schema = { name: pl.String, color: pl.String, height: pl.String };

console.log(pl.readRecords(rows, { schema }));
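
When the column names are not known ahead of time, the explicit schema has to be derived from the records themselves. The following is a minimal sketch of that idea, not part of nodejs-polars: `inferColumnKinds` and `ColumnKind` are hypothetical names, and the helper only reports which keys exist and which of them contain nothing but `null`. Mapping the result to polars data types and passing it as `{ schema }` is left to the caller.

```typescript
// Hypothetical helper (not a nodejs-polars API): scan plain records and
// report every key, tagging columns whose values are all null/undefined.
type ColumnKind = "string" | "number" | "boolean" | "other" | "all-null";

function inferColumnKinds(
  rows: ReadonlyArray<Record<string, unknown>>,
): Map<string, ColumnKind> {
  const kinds = new Map<string, ColumnKind>();
  for (const row of rows) {
    for (const [key, value] of Object.entries(row)) {
      const seen = kinds.get(key);
      // Register the key even when the value is null, so it is never dropped.
      if (seen === undefined) kinds.set(key, "all-null");
      if (value === null || value === undefined) continue;
      // The first non-null value decides the kind; later values are ignored.
      if (seen !== undefined && seen !== "all-null") continue;
      const t = typeof value;
      kinds.set(
        key,
        t === "string" || t === "number" || t === "boolean" ? t : "other",
      );
    }
  }
  return kinds;
}

const records = [
  { name: "John", color: "green", height: null },
  { name: "Anna", color: "green", height: null },
];

const kinds = inferColumnKinds(records);

// Columns that need an explicit data type in the schema.
const allNullColumns = [...kinds.keys()].filter(
  (key) => kinds.get(key) === "all-null",
);
console.log(allNullColumns); // ["height"]
```

The all-null columns returned here are the ones that need a data type chosen by hand, as `height` is in the schema above.
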

Bidek56 commented 1 month ago

@ad-si Does the above solution work for you? Can this be closed? Thx

ad-si commented 1 month ago

Your solution is a workaround and not a fix. So the issue itself still stands.

ad-si commented 1 month ago

One correction to my expected-behavior table, though: the column should actually have the type Unknown: https://docs.pola.rs/api/python/stable/reference/api/polars.datatypes.Unknown.html#polars.datatypes.Unknown

Bidek56 commented 1 month ago

So I think the issue is in this line of code in core polars.