ImSoErgodic / py-upset

A pure-python implementation of the UpSet suite of visualisation methods by Lex, Gehlenborg et al.

Data input format? #31

Open G-kodes opened 4 years ago

G-kodes commented 4 years ago

Can you please document in your readme.md how incoming data should be structured? I can't find anywhere what format my DataFrame needs to be in to render a graph. The only workaround is to grab and unpickle your test data, which defeats the point of the readme.md instructions.

MahmoudAbdelRahman commented 4 years ago

The same issue here, please. Thank you

MahmoudAbdelRahman commented 4 years ago

@SgtPorkChops, I think this resource might be useful: http://data.caleydo.org/papers/2014_infovis_upset.pdf

monika0603 commented 4 years ago

Does the package only take a pickle file as input? How is a pickle file created from the MovieLens dataset?

macho9099 commented 2 years ago

It looks like the input must be a dictionary that contains pandas DataFrames. However, there are some issues with the source code because methods like .ix are deprecated.
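For anyone trying this, here is a minimal sketch of what such an input might look like, using only pandas and the standard library. The dictionary-of-DataFrames shape is taken from the comment above and is not confirmed by the package docs; the set names, column names, and values are made up for illustration. It also shows a pickle round trip (the test data ships as a pickle) and the .loc / .iloc indexers that replace .ix in current pandas.

```python
import pickle

import pandas as pd

# Hypothetical example data: one DataFrame per set, keyed by set name.
# The set names, column names, and values are invented for illustration.
data_dict = {
    "action": pd.DataFrame(
        {"title": ["Movie A", "Movie B"], "rating": [4.5, 4.0]}
    ),
    "romance": pd.DataFrame(
        {"title": ["Movie B", "Movie C"], "rating": [4.0, 3.5]}
    ),
}

# A pickle is just a serialized Python object, so the dictionary can be
# dumped and loaded back unchanged (in memory here; pickle.dump/load
# do the same with a file opened in binary mode).
payload = pickle.dumps(data_dict)
restored = pickle.loads(payload)

# .ix is gone from current pandas; use .loc for label-based access
# and .iloc for position-based access instead.
first_by_label = restored["action"].loc[0, "title"]
first_by_position = restored["action"].iloc[0, 0]
```

Whether the package accepts this dictionary as-is is an assumption; the sketch only shows how to build and pickle the structure described above.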

G-kodes commented 2 years ago

From what I have been able to tell, I needed to first convert my column data into a boolean form using .astype(bool), and then re-factor it into a single-column, multi-index, count form using .groupby(["Column1", "column2", ...]).count().

This made it LOOK like their data and what the docs describe, but the package still complained. What I then discovered is that the package specifically wants a Series and not a DataFrame. I personally used .iloc[:, 0], which returns the first column as a Series, but I believe there is a .squeeze() method that does this conversion directly (see the pandas documentation for DataFrame.squeeze).
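The steps I described can be sketched roughly as follows, using pandas only. The column names and data are hypothetical, and I am not claiming this is the format the package officially expects; it is just the shape that ended up working for me.

```python
import pandas as pd

# Hypothetical membership table: one row per item, one 0/1 column per set.
raw = pd.DataFrame(
    {
        "set_a": [1, 1, 0, 1, 0],
        "set_b": [0, 1, 1, 1, 0],
        "item": ["v", "w", "x", "y", "z"],
    }
)
set_columns = ["set_a", "set_b"]

# Step 1: convert the set-membership columns to booleans.
membership = raw[set_columns].astype(bool)
membership["item"] = raw["item"]

# Step 2: group by the boolean columns and count the rows in each
# combination. The result is a one-column DataFrame with a MultiIndex.
counts_frame = membership.groupby(set_columns).count()

# Step 3: reduce the one-column DataFrame to a Series.
counts = counts_frame.iloc[:, 0]              # first column as a Series
squeezed = counts_frame.squeeze("columns")    # equivalent for one column

print(counts)
```

Note that .iloc[0] (without the colon) selects the first row rather than the first column, so .iloc[:, 0] or .squeeze("columns") is the safer choice here.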

Either way, I am not closing this issue as it is in fact a request for better documentation to explain this aspect of the package.

rLannes commented 2 years ago

That is too bad. I won't waste my time figuring out what the data format should be. This issue has been open for 2 years. It is unfortunate, because this looked nice. I am back to R.