Open NickCrews opened 2 years ago
I like the idea of having a dtype for the dataframe. We already do a similar kind of check in DataFrame.__array__, and we may have some code for this in https://github.com/vaexio/vaex/pull/415 (where we want to know whether a dataframe is of a homogeneous type, so it can be treated as a 'matrix').
What should astype(some_dict_type) actually do?
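For illustration, the "homogeneous type" check mentioned above could look roughly like the sketch below. This is not the code from the linked PR or from DataFrame.__array__; the helper name common_dtype is hypothetical, and it works on plain (name, dtype) pairs so it does not depend on any particular dataframe API.

```python
from typing import Any, Iterable, Optional, Tuple


def common_dtype(columns: Iterable[Tuple[str, Any]]) -> Optional[Any]:
    """Return the single dtype shared by all columns, or None if they differ.

    `columns` is an iterable of (name, dtype) pairs. An empty input also
    yields None. Hypothetical helper, not part of vaex.
    """
    found = None
    for _name, dtype in columns:
        if found is None:
            # First column seen: remember its dtype as the candidate.
            found = dtype
        elif dtype != found:
            # A second, different dtype means the frame is not homogeneous.
            return None
    return found
```

With vaex, the pairs could presumably be built as ((name, df[name].dtype) for name in df.get_column_names()), assuming the dtype objects compare with ==; a non-None result would mean the dataframe can be treated as a 'matrix' of that type.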
Description
I was trying to cast a DataFrame to a pyarrow schema and ran across this issue. DataFrame[column].astype(pyarrow_type) works fine for many pyarrow types, such as float, string, and bool, but it doesn't work for pyarrow.lib.DictionaryType. However, if I use vaex's "wrapped" version, it works fine. I didn't explore other dtypes, but perhaps this also reveals a problem with other complex pyarrow types? See the xfail-ing test PR.
(PS: is there hope for a future DataFrame.astype() similar to pandas? I'm writing this myself, and it feels like I'm reinventing the wheel.)
Software information
Vaex version (import vaex; vaex.__version__): {'vaex': '4.10.0', 'vaex-core': '4.10.0', 'vaex-viz': '0.5.2', 'vaex-hdf5': '0.12.2', 'vaex-server': '0.8.1', 'vaex-astro': '0.9.1', 'vaex-jupyter': '0.8.0', 'vaex-ml': '0.18.0', 'vaex-graphql': '0.2.0'}
main in dev environment
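Regarding the PS about a dataframe-level astype(): a minimal sketch of such a helper, built on the per-column cast, might look like the following. The name astype_columns is hypothetical and not a vaex API. It is duck-typed: it only assumes that df[name] returns something with an .astype(dtype) method and that df[name] = value is supported.

```python
from typing import Any, Mapping


def astype_columns(df: Any, dtypes: Mapping[str, Any]) -> Any:
    """Cast several columns, one by one, following a {column name: dtype} mapping.

    Hypothetical helper, not part of vaex. Columns missing from `dtypes`
    are left untouched. `df` is modified in place and returned.
    """
    for name, dtype in dtypes.items():
        # Delegate to the per-column cast described in the issue.
        df[name] = df[name].astype(dtype)
    return df
```

For a pyarrow schema, the mapping could be built as {field.name: field.type for field in schema}. Since this only loops over the per-column cast, it would presumably hit the same failure for pyarrow.lib.DictionaryType columns until that cast is fixed.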