sdpython / pandas-streaming

Streaming API for pandas applied to big datasets
https://sdpython.github.io/doc/pandas-streaming/dev/
MIT License

Missing packages: scikit-learn and ijson #39

Status: Closed (stephengmatthews closed this issue 2 months ago)

stephengmatthews commented 4 months ago

Running pip install pandas-streaming doesn't install all the packages needed to import the library. It seems that scikit-learn and ijson are missing from the declared dependencies.

The steps to demonstrate that scikit-learn is missing:

python -m venv venv
source venv/bin/activate
pip install pandas-streaming
python -c "import pandas_streaming.df"

And, the output:

ModuleNotFoundError: No module named 'sklearn'

The steps to demonstrate that ijson is missing:

pip install scikit-learn
python -c "import pandas_streaming.df"

And, the output:

ModuleNotFoundError: No module named 'ijson'
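
The two checks above surface one missing module at a time. As a rough sketch (not part of pandas-streaming; the helper name missing_modules is hypothetical), both can be checked in one pass from inside the fresh virtual environment, without triggering the import error:

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment.

    Hypothetical helper for illustration only. It expects top-level names
    such as 'sklearn', not dotted submodule paths.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # 'sklearn' and 'ijson' are the modules named in the two errors above.
    print(missing_modules(["sklearn", "ijson"]))
```

An empty list means both modules are importable. Any name printed still needs to be installed by hand (the import name sklearn is provided by the scikit-learn package).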

ybressler commented 4 months ago

I had the same issue. Consider declaring the dependencies explicitly in pyproject.toml. (Personally, I like Poetry as a dependency manager.)
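
To see which dependencies an installed release actually declares, the standard library's importlib.metadata can read them from the package metadata. This is only a sketch under assumptions: the helper names declared_requirements, requirement_name and undeclared are hypothetical, and the name parsing is a simplification of the full requirement syntax.

```python
import re
from importlib import metadata


def declared_requirements(dist_name):
    """Return the requirement strings a distribution declares, or None if it is not installed."""
    try:
        # metadata.requires() returns None when nothing is declared.
        return metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return None


def requirement_name(requirement):
    """Extract a normalized project name from a string like 'scikit-learn>=1.0'."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    if match is None:
        return None
    # Treat runs of '-', '_' and '.' as equivalent and ignore case.
    return re.sub(r"[-_.]+", "-", match.group(1)).lower()


def undeclared(dist_name, expected):
    """Return the names in `expected` that the distribution does not declare.

    Requirements that only apply to an optional extra still count as declared here.
    """
    requirements = declared_requirements(dist_name)
    if requirements is None:
        return None
    declared = {requirement_name(r) for r in requirements}
    return [name for name in expected if requirement_name(name) not in declared]


if __name__ == "__main__":
    # Distribution names on PyPI, as opposed to the import names sklearn and ijson.
    print(undeclared("pandas-streaming", ["scikit-learn", "ijson"]))
```

If the script prints either name, that package is absent from the declared requirements, which would match the errors reported above. A result of None means pandas-streaming is not installed in the current environment.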