vaexio / vaex

Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
https://vaex.io
MIT License
8.25k stars 590 forks

Add xfail test for df.astype(pa.DictionaryType) #2188

Open NickCrews opened 2 years ago

NickCrews commented 2 years ago

Demonstration of https://github.com/vaexio/vaex/issues/2187

maartenbreddels commented 2 years ago

So, it works for some cases because our dtype wrapper compares equal to str when it's an encoded type that maps to str. The next problem is that, because of the expression system, we would need to add the passed-in types to the variables, since a repr(arrow_dtype) cannot be evaluated again. This would need to use a similar idea to https://github.com/vaexio/vaex/pull/2089, where we don't add types multiple times.
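To illustrate the point about the expression system, here is a rough, stdlib-only sketch, not vaex's actual implementation. `FakeDictionaryType`, `register_dtype`, `astype_expression`, and the `arrow_dtype_N` naming are all hypothetical stand-ins. The stand-in's repr only imitates the general shape of an Arrow dictionary type's repr. The idea is that the expression string refers to the dtype by variable name instead of embedding its repr, and an equal dtype is registered only once.

```python
class FakeDictionaryType:
    """Hypothetical stand-in for an Arrow dictionary dtype (not pyarrow)."""

    def __init__(self, index_type, value_type):
        self.index_type = index_type
        self.value_type = value_type

    def __repr__(self):
        # Like many dtype reprs, this is not a valid Python expression.
        return f"dictionary<values={self.value_type}, indices={self.index_type}, ordered=0>"

    def __eq__(self, other):
        return (
            isinstance(other, FakeDictionaryType)
            and (self.index_type, self.value_type) == (other.index_type, other.value_type)
        )

    def __hash__(self):
        return hash((self.index_type, self.value_type))


def register_dtype(variables, dtype):
    """Return a variable name for dtype, reusing an existing equal entry."""
    for name, value in variables.items():
        if value == dtype:
            return name  # already registered, do not add it a second time
    name = f"arrow_dtype_{len(variables)}"  # hypothetical naming scheme
    variables[name] = dtype
    return name


def astype_expression(column, dtype, variables):
    """Build an expression string that references the dtype by variable name."""
    return f"astype({column}, {register_dtype(variables, dtype)})"


variables = {}
dtype = FakeDictionaryType("int32", "string")

# Embedding repr(dtype) in an expression string would not survive re-evaluation.
try:
    eval(repr(dtype))
except SyntaxError:
    print("repr(dtype) cannot be evaluated")

# Both expressions share one registered variable because the dtypes are equal.
print(astype_expression("x", dtype, variables))
print(astype_expression("y", FakeDictionaryType("int32", "string"), variables))
print(len(variables))
```

The linear scan for an equal dtype keeps the sketch short; a real implementation would probably need a more robust naming and lookup scheme.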

For the rest, I'll first wait for your response to https://github.com/vaexio/vaex/issues/2187#issuecomment-1232709280