vaexio / vaex

Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
https://vaex.io
MIT License
8.23k stars 590 forks

[BUG-REPORT] Requesting a vaex-dataset that contains both numeric and categorical columns #2257

Open abf7d opened 1 year ago

abf7d commented 1 year ago

Description: I am looking to create some unit tests based on a vaex dataset from the website. The ones that exist here don't contain any categorical columns. Can you add a categorical column or two to the existing datasets?

JovanVeljanoski commented 1 year ago

Which dataset are you referring to?

You could use

import vaex

df = vaex.datasets.titanic()

That has categoricals for sure :)

P.S.: If you are writing a unit test to expose a vaex bug or something of that sort, it would be best to generate the smallest possible example. We try not to use those datasets for unit tests.
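
For the "smallest possible example" route, something along these lines might work as a starting point. It is only a sketch: the column names (`x`, `n`, `color`) and the helper names `make_columns` and `make_vaex_df` are made up for illustration, and it assumes `vaex.from_arrays` and `DataFrame.ordinal_encode` behave as in recent vaex 4.x releases.

```python
import numpy as np


def make_columns():
    """Return a tiny, deterministic set of columns (hypothetical names)."""
    return {
        "x": np.array([1.0, 2.5, 3.0, 4.5]),                  # numeric (float)
        "n": np.array([10, 20, 30, 40]),                      # numeric (int)
        "color": np.array(["red", "green", "red", "blue"]),   # string labels
    }


def make_vaex_df():
    """Build a small in-memory vaex DataFrame with one categorical column."""
    # Imported lazily so make_columns() can be used without vaex installed.
    import vaex

    df = vaex.from_arrays(**make_columns())
    # Assumption: ordinal_encode marks the string column as categorical
    # and returns a new DataFrame (vaex 4.x behaviour).
    return df.ordinal_encode("color")
```

In a test you could then call `df = make_vaex_df()` and check `df.is_category("color")`. Keeping the data in a handful of hard-coded rows means a failing test points at vaex itself rather than at a downloaded dataset.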