Looking at the installation size of soundscapy, it is quite large compared with similar packages. We should try to reduce this where possible, either by removing unnecessary dependencies or by moving them into optional dependencies.
For instance:
soundscapy[audio]: 444 MB installed
Plotly: 100 MB installed
sklearn: 135 MB installed
pandas[all]: 688 MB
The largest dependencies in soundscapy[audio] are:
89M plotly
89M llvmlite
72M scipy
41M pandas
28M skimage
19M numpy
19M matplotlib
13M numba
11M fontTools
9.5M PIL
8.9M jupyterlab_plotly
7.2M networkx
3.9M pydantic_core
3.3M resampy
2.6M pytz
2.5M tzdata
2.1M plotly-5.24.1.dist-info
2.0M soundscapy
1.7M python_calamine
1.7M pydantic
1.3M openpyxl
1.2M imageio
1.1M seaborn
Some of these are unavoidable: SciPy, pandas, matplotlib, seaborn, etc. And soundscapy itself, at 2.0M, isn't too bad. The issue is with a few of the heavier dependencies, especially plotly, llvmlite, and skimage.
Plotly should now be straightforward to make an optional dependency: ship seaborn as standard, and users who want the plotly backend can install it separately.
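One way this could look on the code side is a lazy import that only touches plotly when the plotly backend is actually requested. This is a sketch, not existing soundscapy code: `require_optional`, `scatter_plotly`, and the `plotly` extra name are all hypothetical.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency, or raise an error that says how to install it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # The extra name is an assumption; it would need to match pyproject.toml.
        raise ImportError(
            f"'{module_name}' is needed for this feature but is not installed. "
            f"Install it with: pip install 'soundscapy[{extra}]'"
        ) from exc


def scatter_plotly(data, x: str, y: str):
    """Hypothetical plotly-backend entry point; plotly is imported only here."""
    px = require_optional("plotly.express", extra="plotly")
    return px.scatter(data, x=x, y=y)
```

With this pattern, importing soundscapy never fails when plotly is missing; only a call into the plotly backend does, and the error says which extra to install.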
In the core install, llvmlite is only required by pandas[performance], and I was planning to drop the performance extra anyway, so that's easy.
scikit-maad is the source of numba (via resampy) and scikit-image. I don't want to make the optional dependencies as granular as separating psychoacoustics, ecoacoustics, etc., so unless scikit-maad drops these, there's not much we can do. At the least, they shouldn't be needed without the [audio] optionals.
From testing, it looks like removing pandas[performance] alone would bring the core install down to ~300M. Removing plotly as well brings it down to ~200M. This still feels quite big, though.
Removing SciPy would reduce this further to ~123M. It's not required by the other core dependencies, although acoustics, mosqito, and scikit-maad do need it. It doesn't look feasible to remove it from the code, though: we use it for SSM curve-fit optimization and for stats.kurtosis. The stats could probably be done with a smaller dependency, but the optimization would be difficult to replace.
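On the stats side, a rough sketch of what a NumPy-only replacement for the kurtosis call could look like. It assumes the code relies on the `scipy.stats.kurtosis` defaults (Fisher definition, biased estimator, `axis=0`); if other arguments are used, this would need adjusting. The function name is hypothetical.

```python
import numpy as np


def excess_kurtosis(values, axis=0):
    """Biased excess (Fisher) kurtosis: fourth central moment / variance**2 - 3."""
    x = np.asarray(values, dtype=float)
    deviations = x - x.mean(axis=axis, keepdims=True)
    m2 = np.mean(deviations**2, axis=axis)  # second central moment (biased variance)
    m4 = np.mean(deviations**4, axis=axis)  # fourth central moment
    return m4 / m2**2 - 3.0


# Deviations of [1, 2, 3, 4, 5] from the mean are -2..2, so m2 = 2 and m4 = 6.8,
# giving 6.8 / 4 - 3 = -1.3.
print(excess_kurtosis([1, 2, 3, 4, 5]))
```

This doesn't help with the optimization side, so on its own it would not let us drop SciPy.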