Open sungreong opened 2 years ago
@sungreong @maartenbreddels I'm not sure how/why this is happening but I have a quick workaround for you...
For some reason your int classes (class and class1) are having their types destroyed. Vaex somehow thinks they are dictionaries, which explains the error above and why you cannot filter.
The float columns seem fine, and you can filter on them.
So you can cast those columns to int and it'll work fine.
I don't get why it's happening however, but I'm sure Maarten will :)
It has something to do with the partition cols @maartenbreddels, because if you remove those then the filter and the dtypes work as expected.
Thanks. But I do not understand the partitioning function in vaex.open. Can you write an example that uses partitioning with vaex.open?
@sungreong If you don't mind having the data in memory, just open it via arrow as in your example, then use vaex.from_arrow_table to pass it to vaex.
Otherwise the example from @Ben-Epstein is the way I would approach it also.
Description
Hello, I want to load only part of a partitioned parquet dataset with vaex. I could not find anything about this in the documentation, so I am posting it as a bug report. In pyarrow, you can extract rows from partitioned data using the filter function. I could not find an equivalent in vaex, so I'm asking.
In other words, I would like to know whether the filter function supported by pyarrow can be used from vaex, and if so, I would appreciate an example.
Software information
Additional information
generate data (partitioning)
filter
filter data using pyarrow parquet
output : (20106, 7)
filter data using vaex (I am not sure this is correct)
output : (100000, 7)
filter data using vaex (just try)