Open Ben-Epstein opened 1 year ago
You should be able to reproduce it here https://colab.research.google.com/drive/1J085UZolLNcaL8zhVKY0LQzbgFMXnYur?usp=sharing
Exporting to arrow (also shown in the notebook above) works fine: it takes about 1 minute and memory usage stays very low.
Software information
Vaex version (import vaex; vaex.__version__): 4.16.1

Additional information
If you run this on a memory-limited machine such as the free tier of Google Colab, you will get an OOM crash when exporting to hdf5, even though exporting to arrow works fine. We need to convert the string to a large_string because of a pyarrow issue: https://issues.apache.org/jira/browse/ARROW-17828