Closed: gplanansky closed this issue 8 months ago
Please use pl.scanCSV until we have a PR for this issue. Thanks for understanding and for bringing this issue up.
const df = await pl.scanCSV("data_tsv.txt", { sep: "\t" }).collect();
The code only allows these extensions: [".tsv", ".csv"]
Otherwise it thinks the input is inline text.
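To illustrate the behavior described above, here is a minimal sketch of an extension-based check. It is not the actual nodejs-polars source: the names `ALLOWED_EXTENSIONS` and `classifyInput` are hypothetical, and the real implementation may differ in its details. It only shows why a string such as "data_tsv.txt" would be treated as inline text rather than as a file path.

```typescript
// Hypothetical sketch of an extension allow-list check.
// Assumption: this is NOT the nodejs-polars implementation, only an illustration.
const ALLOWED_EXTENSIONS: string[] = [".tsv", ".csv"];

function classifyInput(input: string): "path" | "inline" {
  // Take everything from the last dot onward as the extension, if any.
  const dot = input.lastIndexOf(".");
  const ext = dot > 0 ? input.slice(dot).toLowerCase() : "";
  // Only allow-listed extensions are treated as file paths;
  // anything else falls through to "inline text".
  return ALLOWED_EXTENSIONS.includes(ext) ? "path" : "inline";
}

console.log(classifyInput("data.tsv"));     // "path"
console.log(classifyInput("data_tsv.txt")); // "inline"
```

Under a check like this, a ".txt" file name is parsed as if it were the CSV content itself, which matches the symptom reported in this issue.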
https://github.com/pola-rs/nodejs-polars/pull/156 has a fix for this issue.
Using version 0.8.3 should work as well.
import pl from "npm:nodejs-polars@0.8.3";
Thanks. And noted on the allowed extensions: I only used ".txt" here because the "paste, click to add files" upload does not support the .tsv extension.
Can you please check: "nodejs-polars": "0.9.0" ? Thx
@Bidek56
It works. Using nodejs-polars 0.9.0, data files with .txt extensions yield the same correct results as data files with .csv and .tsv extensions.
yay!
Tested using the same example files:
$ ll
-rw-r--r-- 1 george staff 66621 Mar 9 02:49 data.csv
-rw-r--r-- 1 george staff 66621 Mar 9 02:49 data.tsv
-rw-r--r-- 1 george staff 66621 Mar 9 02:41 data_csv.txt
-rw-r--r-- 1 george staff 66621 Mar 9 02:41 data_tsv.txt
$ cat data_csv.txt | head -1
scalerank,featurecla,labelrank,sovereignt,sov_a3,adm0_dif,level,type,admin, ...
$ cat data_tsv.txt | head -1
scalerank featurecla labelrank sovereignt sov_a3 adm0_dif ...
$ diff data.csv data_csv.txt
$ diff data.tsv data_tsv.txt
$ deno
Deno 1.41.2
exit using ctrl+d, ctrl+c, or close()
REPL is running with all permissions allowed.
To specify permissions, run `deno repl` with allow flags.
> import pl from "npm:nodejs-polars";
undefined
> pl.pl.version
"0.9.0"
> let csv = await Deno.readTextFile('data.csv')
> let dfcsv = pl.readCSV(csv, { sep: "," });
> dfcsv.columns
[
"scalerank", "featurecla", "labelrank", "sovereignt", ...
> let tsv = await Deno.readTextFile('data.tsv')
> let dftsv = pl.readCSV(tsv, { sep: "\t" });
> dftsv.columns
[
"scalerank", "featurecla", "labelrank", "sovereignt", ...
> let data_csv = await Deno.readTextFile('data_csv.txt');
> let df_csv = pl.readCSV(data_csv, { sep: "," });
> df_csv.columns
[
"scalerank", "featurecla", "labelrank", "sovereignt", ...
> let data_tsv = await Deno.readTextFile('data_tsv.txt');
> let df_tsv = pl.readCSV(data_tsv, { sep: "\t" });
> df_tsv.columns
[
"scalerank", "featurecla", "labelrank", "sovereignt", ...
What version of polars are you using?
version: "0.8.4"
What operating system are you using polars on?
mac os 13.6
What deno version are you using?
deno 1.39.2
Describe your bug.
Below, pl.readCSV(data_tsv, { sep: "\t" }); on the tsv file fails to separate the data items, whereas pl.readCSV(data_csv, { sep: "," }); on the csv file succeeds.
(This is from an example using polars in deno that evidently worked 4 months ago: https://github.com/rgbkrk/denotebooks/blob/main/10.2_Polar%20DataFrames.ipynb)
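A quick way to confirm the symptom independently of polars is to split the header line yourself and count the fields. The sketch below is only an illustration: `headerColumns` is a hypothetical helper, the split is naive (no quoting support), and the sample row is made up. It is not how polars parses CSV.

```typescript
// Illustrative helper (hypothetical name): naive split of the first line.
// Assumption: no quoted fields; this is NOT how polars parses CSV.
function headerColumns(text: string, sep: string): string[] {
  const firstLine = text.split(/\r?\n/, 1)[0] ?? "";
  return firstLine.split(sep);
}

// Made-up sample using three of the column names from the data files.
const tsvSample = "scalerank\tfeaturecla\tlabelrank\n0\tx\t0";

console.log(headerColumns(tsvSample, "\t").length); // 3: separator honored
console.log(headerColumns(tsvSample, ",").length);  // 1: whole line is one field
```

If a data frame's columns property has a single entry containing the whole header line, the separator was not applied to the input, which is the failure described here.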
Running deno in a directory with the data_tsv.txt, data_csv.txt files:
data_csv.txt data_tsv.txt