vaexio / vaex

Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
https://vaex.io
MIT License

[FEATURE-REQUEST] Getting dtype of columns as they are when rendered in a pandas dataframe? #2403

Open yohplala opened 7 months ago

yohplala commented 7 months ago

Description: Hello, I would like to get the dtypes of the columns as they appear once the vaex dataframe is converted to a pandas dataframe. By default, vaex reports dtypes using its own class.

import vaex
df=vaex.from_items(("a",[1,2,3]),("b",[1.1, 2.1, 3.1]))
type(df.dtypes['a'])
Out[64]: vaex.datatype.DataType

But,

type(df[:1].to_pandas_df().dtypes.to_dict()['a'])
Out[65]: numpy.dtype[int64]

Is there any way to get the result of the second method without making vaex compute a row? (In the example above, I am making vaex compute the first row.) I would like to avoid this if possible, because I use this information in a setup step.

Thanks for your help!
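
To make the request concrete, here is a minimal sketch of the kind of row-free lookup I have in mind. It uses only NumPy and pandas: the helper `pandas_dtypes_from_numpy` is a hypothetical name, and it builds zero-length columns, so no data is evaluated. How to obtain the input mapping from vaex is an assumption on my side (possibly via the `numpy` property of `vaex.datatype.DataType`, at least for numeric columns); I have not confirmed that this is the intended API.

```python
import numpy as np
import pandas as pd


def pandas_dtypes_from_numpy(column_dtypes):
    """Return the dtypes pandas assigns to each column, using zero-length columns only.

    `column_dtypes` maps column names to anything np.dtype() accepts.
    (Hypothetical helper, not part of vaex or pandas.)
    """
    empty = pd.DataFrame(
        {
            name: pd.Series([], dtype=np.dtype(dtype))
            for name, dtype in column_dtypes.items()
        }
    )
    return empty.dtypes.to_dict()


# Assumption: with vaex, the mapping might be built without computing a row as
#   {name: df.dtypes[name].numpy for name in df.get_column_names()}
# Here it is written out by hand for the two columns of the example above.
result = pandas_dtypes_from_numpy({"a": "int64", "b": "float64"})
print({name: type(dtype) for name, dtype in result.items()})
```

The open question is whether vaex already exposes something equivalent, so that the mapping does not have to be assembled by hand.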