vaexio / vaex

Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
https://vaex.io
MIT License

docs: Add to supported data types table #2192

Open NickCrews opened 2 years ago

NickCrews commented 2 years ago

I was getting confused as to how my pyarrow timestamps were getting converted, and I think seeing this here would have pointed me in the right direction sooner.

JovanVeljanoski commented 1 year ago

About the timestamp array in Arrow: in vaex I believe it is stored as Arrow; it is not really converted to NumPy. However, since many of the supported datetime and timedelta operations come via pandas, it is true that the data is cast to NumPy (well, a pandas Series) on the fly.

But the data is still stored as Arrow.

I don't know how to state this precisely in the short text of that table, so what you've written is probably fine. @maartenbreddels?

maartenbreddels commented 1 year ago

Indeed, it's being converted on the fly when doing many operations. How did it confuse you, @NickCrews?

NickCrews commented 1 year ago

I don't remember exactly now 😭, should have been more detailed.

But I think it was constructing a DF from a pyarrow timestamp array, doing some operations on it, and then re-exporting with .values, and I was surprised that it had turned into a numpy array.