facebookresearch / hiplot

HiPlot makes understanding high dimensional data easy
https://facebookresearch.github.io/hiplot/
MIT License
2.74k stars, 138 forks

Axis not displayed when loading CSV #240

Open lw opened 2 years ago

lw commented 2 years ago

When I upload the attached CSV to the hosted version of HiPlot, some of its columns do not appear as axes in the parallel plot, although they appear correctly in the table below it. Right-clicking the table's headers does not offer an option to restore these axes to the plot. The missing axes are "Ratio" and "Log(ratio)".

Untitled spreadsheet - Sheet5.csv

lw commented 2 years ago

I think this could be related to some rows having "#DIV/0!" as a value, which might confuse HiPlot: these axes can't be categorical (too many distinct values?) but can't be numeric either?

danthe3rd commented 2 years ago

Hi @lw

The columns are indeed classified as "Categorical", as is visible in the table. HiPlot will not display categorical columns with more than 50 (I think?) distinct values on the parallel plot, because it does not make sense to slice through so many values. These columns are treated as categorical because they can't be numeric: they contain non-numeric values. There is a limited list of non-numeric values allowed in numerical columns, which includes "nan", "inf", "null", etc., but not "#DIV/0!" at the moment.
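To make the behaviour described above concrete, here is a minimal Python sketch of that kind of column classification. It is not HiPlot's actual code: the function names, the token list, and the limit of 50 are assumptions taken from the description in this comment.

```python
# Assumption: illustrative allow-list only; HiPlot's real list may differ.
SPECIAL_NUMERIC_TOKENS = {"nan", "inf", "-inf", "null"}
# Assumption: the limit recalled above ("50, I think?").
MAX_CATEGORIES_ON_PLOT = 50


def is_numeric_like(value: str) -> bool:
    """True if the cell parses as a number or is an allowed special token."""
    text = value.strip().lower()
    if text in SPECIAL_NUMERIC_TOKENS:
        return True
    try:
        float(text)
    except ValueError:
        return False
    return True


def classify_column(values):
    """A single non-numeric cell makes the whole column categorical."""
    if all(is_numeric_like(v) for v in values):
        return "numeric"
    return "categorical"


def shown_on_parallel_plot(values):
    """Hide categorical columns with too many distinct values."""
    if classify_column(values) == "numeric":
        return True
    return len(set(values)) <= MAX_CATEGORIES_ON_PLOT
```

Under these assumptions, a "Ratio" column with dozens of distinct numbers plus one "#DIV/0!" cell is classified as categorical, exceeds the distinct-value limit, and is therefore left off the plot while still appearing in the table.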


lw commented 2 years ago

Ok, thanks! This doesn't need an immediate fix now that I know the root cause and I can clean my data. Having HiPlot show a warning would be nice though :)
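For anyone hitting the same problem, one possible way to clean the data before uploading is sketched below, using only the Python standard library. The `clean_csv` helper and the list of error markers are hypothetical and not part of HiPlot. It replaces spreadsheet error cells with "nan" (one of the tokens mentioned above as accepted in numerical columns) and emits the kind of warning requested here.

```python
import csv
import io
import warnings

# Assumption: common spreadsheet error markers; extend for your own data.
SPREADSHEET_ERRORS = {
    "#DIV/0!", "#N/A", "#VALUE!", "#REF!", "#NUM!", "#NAME?", "#NULL!",
}


def clean_csv(text: str) -> str:
    """Replace spreadsheet error cells with "nan" and warn about affected columns."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return text
    header, body = rows[0], rows[1:]
    affected = set()
    for row in body:
        for i, cell in enumerate(row):
            if cell.strip() in SPREADSHEET_ERRORS:
                row[i] = "nan"
                if i < len(header):
                    affected.add(header[i])
    if affected:
        warnings.warn(
            "Replaced spreadsheet errors with nan in columns: "
            + ", ".join(sorted(affected))
        )
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows([header] + body)
    return out.getvalue()
```

The cleaned text can be written back to a file and uploaded as before. Replacing the errors with "nan" rather than dropping the rows keeps the other columns of those rows available in the plot.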

danthe3rd commented 2 years ago

Leaving this issue open to track adding a "Warning" indication.