Open oscar6echo opened 1 month ago
It does look like _repr_html_ on the Python side has quite a bit more logic than the JS side.
It should be pretty easy to copy over the python html logic.
I cannot reproduce the dataframe display shown in the README (and below), which is the same as in python-polars. Is it still possible? If not, why not?
It works fine for me using nodejs-polars v. 0.14.0. Can you please show your code and the output?
The deno-kernel Jupyter notebook shows only the first 50 rows so that large output does not crash the browser, but this is configurable using process.env.POLARS_FMT_MAX_ROWS. This was discussed during the PR review.
I cannot reproduce the dataframe display shown in the README (and below), which is the same as in python-polars. Is it still possible? If not, why not?
It works fine for me using nodejs-polars v. 0.14.0. Can you please show your code and the output?
Here is the output from notebook, jupyter console and terminal.
1/ notebook
2/ jupyter console
3/ terminal
In none of these cases can I reproduce the nice display shown in the README, which happens to be similar to that of the Python version's print(df).
What should I do to get it?
I am not using Deno, but using the bun command line it works fine.
Ok, here is what I get with bun repl:
So the output contains what is shown in the README nice display but it is not quite the same.
I find it a bit disconcerting that such basic use is not reproducible on either deno or bun repl.
The deno-kernel Jupyter notebook shows only the first 50 rows so that large output does not crash the browser, but this is configurable using process.env.POLARS_FMT_MAX_ROWS. This was discussed during the PR review.
It does look like _repr_html_ on the Python side has quite a bit more logic than the JS side.
Indeed. Please compare with the Python version (arguably the reference, and certainly more informative):
You get the shape and the first/last rows/cols shown (controlled by POLARS_FMT_MAX_(ROW|COL)S).
With nodejs-polars, by contrast, you get only the first rows (controlled by POLARS_FMT_MAX_ROWS), without any indication of shape.
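To make the difference concrete, here is a rough sketch of the "shape plus first/last rows" selection described above. The names readMaxRows and visibleRows and the fallback of 10 are hypothetical, chosen for illustration only; this is not the code used by python-polars or nodejs-polars, just one way such a cap could be applied.

```javascript
// Hypothetical helper: read the row cap from the environment.
// The fallback value is an assumption made for this sketch.
function readMaxRows(env = process.env, fallback = 10) {
  const parsed = Number.parseInt(env.POLARS_FMT_MAX_ROWS ?? "", 10);
  return Number.isNaN(parsed) ? fallback : parsed;
}

// Build the half-open integer range [start, end).
function range(start, end) {
  return Array.from({ length: end - start }, (_, i) => start + i);
}

// Hypothetical helper: decide which row indices to display.
// Small frames are shown in full; larger ones keep the first and last rows.
function visibleRows(height, maxRows) {
  if (height <= maxRows) {
    return { head: range(0, height), tail: [], truncated: false };
  }
  const headCount = Math.ceil(maxRows / 2);
  const tailCount = maxRows - headCount;
  return {
    head: range(0, headCount),
    tail: range(height - tailCount, height),
    truncated: true,
  };
}

// Example: a 100-row frame displayed with a cap of 6 rows.
const selection = visibleRows(100, readMaxRows({ POLARS_FMT_MAX_ROWS: "6" }));
console.log(selection);
```

Printing the shape line separately, before the selected rows, is what lets the reader see that rows were left out.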
It should be pretty easy to copy over the python html logic.
Perhaps to somebody who knows the inner workings of (1) polars-py, (2) polars-nodejs, and (3) the various specifics of the target runtimes (nodejs, deno, bun), since, as the comment above shows, they add their own layer before display.
For example, I could not find where the selection of rows and cols (first and last, selected based on env variables) is performed below polars/polars/dataframe/frame.py | __repr_html__.
Can you please use console.log(df); in bun repl? It works fine for me. I do recall there being an issue with bun's implementation of [Symbol.for("nodejs.util.inspect.custom")]().
> console.log(df);
shape: (5, 4)
┌─────┬────────┬─────┬────────┐
│ A   ┆ fruits ┆ B   ┆ cars   │
│ --- ┆ ---    ┆ --- ┆ ---    │
│ f64 ┆ str    ┆ f64 ┆ str    │
╞═════╪════════╪═════╪════════╡
│ 1.0 ┆ banana ┆ 5.0 ┆ beetle │
│ 2.0 ┆ banana ┆ 4.0 ┆ audi   │
│ 3.0 ┆ apple  ┆ 3.0 ┆ beetle │
│ 4.0 ┆ apple  ┆ 2.0 ┆ beetle │
│ 5.0 ┆ banana ┆ 1.0 ┆ beetle │
└─────┴────────┴─────┴────────┘
Can you please use console.log(df); in bun repl? It works fine for me. I do recall there being an issue with bun's implementation of [Symbol.for("nodejs.util.inspect.custom")]().
I get the same output!
It would be good if nodejs-polars were explicit about what runtimes should implement to output the proper display (as in the README). Maybe this is already the case? If so, where?
@oscar6echo The formatting discrepancy arises because, unlike Python and Rust, JavaScript has no native way to overload methods, so we need to use a Proxy object to support some syntaxes such as bracket notation: df['column'].
console.log should always print the correct output, as most runtimes have standardized on using Symbol.for("nodejs.util.inspect.custom"), but unfortunately there is no way to forward the inspect symbol to the dataframe class when wrapping it in a proxy. So it's either drop support for the functionality that the proxy provides, or use console.log.
Edit: df.toString() should also work the same as console.log(df).
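The trade-off described above can be illustrated with a small stand-alone sketch. MiniFrame and wrap are hypothetical names and are not part of nodejs-polars. The sketch only shows how a Proxy can provide df['column'] access while an explicit toString() call still yields the formatted text. How a particular runtime's inspector or notebook display treats the proxy is deliberately left out, since that varies by runtime.

```javascript
// Symbol that most runtimes look up when inspecting an object.
const inspectSymbol = Symbol.for("nodejs.util.inspect.custom");

// Hypothetical stand-in for a dataframe; not the nodejs-polars implementation.
class MiniFrame {
  constructor(columns) {
    this.columns = columns; // e.g. { A: [1, 2], fruits: ["banana", "apple"] }
  }

  // [rows, columns], derived from the first column's length.
  get shape() {
    const names = Object.keys(this.columns);
    const height = names.length === 0 ? 0 : this.columns[names[0]].length;
    return [height, names.length];
  }

  toString() {
    const [height, width] = this.shape;
    return `shape: (${height}, ${width})`;
  }

  // Custom inspect hook, delegating to toString().
  [inspectSymbol]() {
    return this.toString();
  }
}

// Wrap a frame so that df["name"] returns a column when "name" is a column
// and not an existing property of the frame itself.
function wrap(frame) {
  return new Proxy(frame, {
    get(target, prop, receiver) {
      if (typeof prop === "string" && prop in target.columns && !(prop in target)) {
        return target.columns[prop];
      }
      return Reflect.get(target, prop, receiver);
    },
  });
}

const df = wrap(
  new MiniFrame({ A: [1, 2, 3], fruits: ["banana", "banana", "apple"] })
);

console.log(df["fruits"]);  // bracket access served by the proxy
console.log(df.toString()); // shape: (3, 2)
```

Calling toString() explicitly sidesteps the question of whether the inspector sees the proxy or the wrapped object, which is why it is a dependable fallback.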
@universalmind303 thx for the insight.
So the working syntax with deno is console.log(df.toString()).
Maybe a bit verbose, but the output is identical to the Python version. This is useful info!
Examples:
1/ small df
2/ larger df
so we need to use a Proxy object to support some syntaxes such as bracket notation: df['column']
Ok, this is your decision - who am I to debate it - but the .select() syntax achieves the same, is central in polars-py, and is more IDE friendly with completion etc. The df['mycol'] syntax seems mostly a contrived way to mimic the pandas legacy API - I was a heavy pandas user and am now an intensive polars-py one. One may argue this legacy API may not be worth keeping, in particular if it hinders basic user experience. 🤔
But this is only a side remark.
The main point is: Congrats and thank you for putting together and maintaining nodejs-polars 👍
Ok, this is your decision - who am I to debate it - but the .select() syntax achieves the same, is central in polars-py, and is more IDE friendly with completion etc. The df['mycol'] syntax seems mostly a contrived way to mimic the pandas legacy API - I was a heavy pandas user and am now an intensive polars-py one. One may argue this legacy API may not be worth keeping, in particular if it hinders basic user experience. 🤔
I have thought about deprecating the syntax, as I too find the Proxy stuff a bit annoying. I know py-polars discourages the usage of it anyway.
This is less a feature request than a question:
But for nodejs-polars only the first 50 rows are shown, without any indication of the df shape.
Is this on purpose or a shortcut?
Suggestion: It would help users if the py/js displays matched, both in print/console.log and in Jupyter.