Data conversion error from csv to DataPreview mode

RandomFractals / vscode-data-preview

Data Preview 🈸 extension for importing 📤 viewing 🔎 slicing 🔪 dicing 🎲 charting 📊 & exporting 📥 large JSON array/config, YAML, Apache Arrow, Avro, Parquet & Excel data files

https://marketplace.visualstudio.com/items?itemName=RandomFractalsInc.vscode-data-preview

Apache License 2.0


Data conversion error from csv to DataPreview mode #238

Open gaille34 opened 3 years ago

gaille34 commented 3 years ago

Hi, when I convert or visualize a .csv file with a value XX.0 in the first line of a column, all values in the other lines of the same column lose the digits after the decimal point: -41.6 becomes -41. Please see the 2 attached screenshots. Best regards (screenshots: DataPreview_file, csv_file)

IlyaOrson commented 3 years ago

I hit a similar issue with this data.zip:

t,x1,x2,c1
0.0,1.0,3.0,1.453712e-11
0.01,0.99932516,3.0006747,1.4536232e-11
0.02,0.99864984,3.0013502,1.4535234e-11
0.03,0.99797404,3.002026,1.4534291e-11
...

(image)

Line plots are broken for the x1 and x2 columns, the other ones are parsed correctly.

RandomFractals commented 3 years ago

that's because your first line has ints and data type is detected based on the first data line.

also, not an error. I'd call it incorrect data precision issue at best :)
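
To illustrate the behavior described above, here is a minimal sketch of first-row type inference. It is not the extension's actual code: `infer_type_from_first_row` and `parse_column` are hypothetical helpers, and the sample column is made up from the values mentioned in this thread.

```python
import csv
import io


def infer_type_from_first_row(value: str) -> type:
    """Hypothetical helper: choose int when the value has no fractional part."""
    return int if float(value).is_integer() else float


def parse_column(rows, column):
    """Parse a whole column using the type inferred from the first data row only."""
    column_type = infer_type_from_first_row(rows[0][column])
    # When column_type is int, int(float(...)) truncates toward zero,
    # so -41.6 becomes -41 and the decimals are lost.
    return [column_type(float(row[column])) for row in rows]


# Assumed sample data: the first data row holds 43.0, which looks like an int.
SAMPLE = "value\n43.0\n-41.6\n12.5\n"
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
parsed = parse_column(rows, "value")
print(parsed)
```

Changing the first value to 43.001, as in the workaround mentioned later in the thread, makes the sketch infer `float` and keep the decimals in every row.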

gaille34 commented 3 years ago

Ok, I see better now. Would it be possible to configure/manage this "precision" manually? Today I have to be cautious with large datasets, and I need to put 43.001

to have this result:

Best regards

RandomFractals commented 3 years ago

yeah, I need to see if I can include more data rows for detecting ints and decimals, and there is a separate ticket for custom data types that I plan to address at some point (#156)

Meussdorffer commented 3 years ago

> that's because your first line has ints and data type is detected based on the first data line.
>
> also, not an error. I'd call it incorrect data precision issue at best :)

Not an error in the code, but certainly an error in methodology. I'd suggest implementing some random sampling to infer type / precision instead of naively using the first row. This tool is unusable for me because of this bug.
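
The sampling idea suggested here could look roughly like the sketch below. It is only an illustration under stated assumptions, not a proposed patch for the extension: `infer_numeric_type`, `parse_column_sampled`, the `sample_size` default, and the fixed `seed` are all hypothetical.

```python
import random


def infer_numeric_type(values, sample_size=100, seed=0):
    """Return float if any sampled value has a fractional part, else int."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    if len(values) <= sample_size:
        sample = values  # small column: inspect every value
    else:
        sample = rng.sample(values, sample_size)
    for raw in sample:
        if not float(raw).is_integer():
            return float
    return int


def parse_column_sampled(values, sample_size=100):
    """Parse a column with the type inferred from a random sample of rows."""
    column_type = infer_numeric_type(values, sample_size)
    return [column_type(float(v)) for v in values]


# Assumed sample column: the first row looks like an int, later rows do not.
column = ["43.0", "-41.6", "12.5"]
parsed = parse_column_sampled(column)
print(parsed)
```

Sampling lowers the chance of misdetecting a column but does not remove it: a column with a single decimal value among many whole numbers can still be inferred as int unless every row is scanned.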

Meussdorffer commented 3 years ago

> @Meussdorffer ok! I am always open to suggestions from devs with 0 commits :)

Always a great look when you patronize your users for making legitimate suggestions.