vaexio / vaex

Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
https://vaex.io
MIT License

ModuleNotFoundError: No module named 'vaex.hdf5' #2376

Closed HajimeKawahara closed 1 year ago

HajimeKawahara commented 1 year ago

Many thanks for developing this great package! We have been using vaex in our project, ExoJAX (through radis), for the past year or so, and from time to time, depending on the environment, we have been plagued by the following installation error.

E       ModuleNotFoundError: No module named 'vaex.hdf5'

I see this error especially in CI on GitHub Actions. Here is an example.

In my local environments I sometimes get this error too, but I have not been able to figure out which installation order triggers it. Usually, removing the vaex_(***) packages with pip and then reinstalling vaex with pip resolves it. Also, it did not seem to occur when pip install vaex (4.16.0) was run first on a clean installation of the latest Anaconda (Anaconda3-2023.03-1-Linux-x86_64).

Does anyone have a similar experience? Or could you share any information that might give us a clue?

HajimeKawahara commented 1 year ago

I was able to reproduce the error in my local environment and found that there was no vaex directory in site-packages:

shirochan:~>ls -l anaconda3/lib/python3.10/site-packages/vaex
vaex-4.16.0-py3.10.egg/                   vaex_jupyter-0.8.1-py3.10.egg/
vaex_astro-0.9.3-py3.10.egg/              vaex_ml-0.18.1-py3.10.egg/
vaex_core-4.16.1-py3.10-linux-x86_64.egg/ vaex_server-0.8.1-py3.10.egg/
vaex_hdf5-0.14.1-py3.10.egg/              vaex_viz-0.5.4-py3.10.egg/
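
The check above can also be scripted, for example as a CI step before the tests run. The following is only a minimal sketch: the helper name inspect_site_packages is made up for illustration, and it assumes the relevant directory is the interpreter's "purelib" site-packages path. It reports whether a plain vaex directory exists and which vaex*.egg entries sit next to it.

```python
import sysconfig
from pathlib import Path


def inspect_site_packages(site_dir, prefix="vaex"):
    """Report whether `site_dir/<prefix>` exists and list `<prefix>*.egg` entries.

    Hypothetical helper for illustration; it only looks at names on disk
    and does not import anything.
    """
    site_dir = Path(site_dir)
    has_package_dir = (site_dir / prefix).is_dir()
    eggs = sorted(p.name for p in site_dir.glob(f"{prefix}*.egg"))
    return has_package_dir, eggs


if __name__ == "__main__":
    # Assumption: the packages live in the interpreter's "purelib" directory.
    site_dir = sysconfig.get_paths()["purelib"]
    has_package_dir, eggs = inspect_site_packages(site_dir)
    print(f"site-packages: {site_dir}")
    print(f"vaex directory present: {has_package_dir}")
    for egg in eggs:
        print(f"egg entry: {egg}")
```

If this prints egg entries but no vaex directory, the environment matches the listing above.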

HajimeKawahara commented 1 year ago

A tentative workaround is to manually uninstall the vaex-(***) packages and then reinstall vaex at the end of setup.py, although it is not an elegant fix.

import subprocess
import sys


def uninstall(package):
    # Call pip through the current interpreter so the active environment is modified.
    subprocess.check_call([sys.executable, "-m", "pip", "uninstall", "-y", package])


def reinstall(package):
    # Remove any existing copy first, then install a fresh one.
    uninstall(package)
    subprocess.check_call([sys.executable, "-m", "pip", "install", package])


# Remove every vaex sub-package before reinstalling vaex itself.
for name in ("vaex-core", "vaex-astro", "vaex-jupyter", "vaex-ml",
             "vaex-hdf5", "vaex-server", "vaex-viz"):
    uninstall(name)

reinstall("vaex")
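
A variant of the same workaround could discover the installed vaex-(***) distributions instead of hard-coding them. This is only a sketch under assumptions: the helper names find_distributions, uninstall_command and clean_reinstall are hypothetical, and it assumes that importlib.metadata can see the installed distributions in the affected environment, which I have not verified for egg-style installs.

```python
import subprocess
import sys
from importlib import metadata


def find_distributions(prefix="vaex-"):
    """Return sorted names of installed distributions starting with `prefix`.

    Hypothetical helper; names are lower-cased and underscores are turned
    into dashes before matching.
    """
    names = set()
    for dist in metadata.distributions():
        name = (dist.metadata.get("Name") or "").lower().replace("_", "-")
        if name.startswith(prefix):
            names.add(name)
    return sorted(names)


def uninstall_command(packages):
    """Build (but do not run) the pip command that removes `packages`."""
    return [sys.executable, "-m", "pip", "uninstall", "-y", *packages]


def clean_reinstall(package="vaex"):
    """Uninstall every discovered vaex-* distribution, then reinstall `package`."""
    found = find_distributions()
    if found:
        subprocess.check_call(uninstall_command(found))
    subprocess.check_call(uninstall_command([package]))
    subprocess.check_call([sys.executable, "-m", "pip", "install", package])
```

Nothing runs until clean_reinstall() is called, so the discovery step can first be checked on its own with print(find_distributions()).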