Closed vadikmironov closed 3 years ago
Hi,
Thank you for reporting this. The issue you reported is already fixed in the latest version on master. I believe the fix is out in the latest alpha (not sure, though).
However, running your exact example raises another error:
~/vaex/packages/vaex-core/vaex/array_types.py in numpy_dtype_from_arrow_type(arrow_type, strict)
287 return map_arrow_to_numpy[arrow_type]
288 except KeyError:
--> 289 raise NotImplementedError(f'Cannot convert {arrow_type}')
290
291
NotImplementedError: Cannot convert date32[day]
This is because numpy has only datetime64, i.e. there is no such thing as datetime32 in numpy. A way around this is to force numpy to operate at nanosecond resolution instead of at day resolution as in your example. This requires explicitly stating the dtype like this (following your example):
import numpy as np
arrays_numpy = {'label': np.array(['date1', 'date2']),
                'date': np.array([np.datetime64('2021-10-01'), np.datetime64('2021-10-02')], dtype='datetime64[ns]')}
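When converting to arrow now, a 64-bit type is used instead of date32 (as far as I can tell pyarrow maps datetime64[ns] to timestamp[ns] rather than date64), and that can be converted back to numpy.datetime64.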
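If the dates already exist as a day-resolution array, casting should work just as well as rebuilding the array with an explicit dtype. The following is a minimal sketch using only numpy; the variable names dates_day and dates_ns are made up for illustration, and whether the result gets past the error above depends on your Vaex version:

```python
import numpy as np

# Without an explicit dtype, numpy infers day resolution from the inputs.
dates_day = np.array([np.datetime64('2021-10-01'), np.datetime64('2021-10-02')])
print(dates_day.dtype)  # datetime64[D]

# Cast to nanosecond resolution; the calendar dates themselves are unchanged.
dates_ns = dates_day.astype('datetime64[ns]')
print(dates_ns.dtype)  # datetime64[ns]

# Same dictionary layout as in the example above, now with nanosecond dates.
arrays_numpy = {'label': np.array(['date1', 'date2']), 'date': dates_ns}
```

Checking the dtype before handing the arrays over is a cheap way to confirm which resolution you are actually passing in.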
I hope this helps! J.
Thanks a lot. I'll close the issue now and check once the version is cut and published.
Actually, you may try the "fix" I described earlier (specifying the ns dtype); it might already work for your version.
Description
When working with a dataset that has a date column, I ran into an issue related to the arrow Date32Array column type (describe fails with a weird error about a type conversion not being implemented, which is likely in pyarrow and is fair enough). However, when playing around with the array_type='numpy' parameter I hit another issue which seems similar to https://github.com/vaexio/vaex/issues/1045 , but that issue was reported as closed in some early 4.0 alpha.
I've been able to narrow this down to the following small snippet that demonstrates the problem:
which fails with the following:
Software information
Vaex version (import vaex; vaex.__version__): {'vaex-core': '4.5.1'}