microsoft / Semi-supervised-learning

A Unified Semi-Supervised Learning Codebase (NeurIPS'22)
https://usb.readthedocs.io
MIT License

Reduce the complexity of dependencies #156

Closed (adamtupper closed this issue 1 year ago)

adamtupper commented 1 year ago

🚀 Feature

Pare down requirements.txt so that it lists only "top-level" packages, i.e. those that the codebase actually uses directly, and drops packages that are only installed as dependencies of those packages. Furthermore, instead of pinning the versions of these packages, specify minimum versions.

Motivation

Over-specifying dependencies in the requirements makes using the library out-of-the-box on systems where you don't have full control of the available dependencies much trickier. This is very common when trying to use public/enterprise computing clusters (e.g. Compute Canada). Furthermore, tying sub-dependencies to specific versions while leaving the versions of core dependencies flexible (e.g. PyTorch, NumPy, etc.) leads to mismatching dependency errors. This will become increasingly problematic as the core dependencies are updated.

I understand the desire to fix the versions of everything for reproducibility. However, when the versions of sub-dependencies are fixed but the versions of core dependencies (again, PyTorch etc.) are not, fixing the versions of these sub-dependencies is redundant.

Pitch

Already, core dependencies (e.g., PyTorch) are specified using minimum versions. I propose removing sub-dependencies from requirements.txt and specifying the remaining core dependencies as minimum versions (with the currently specified version as the new minimum).

The resulting requirements file would look like this:

matplotlib>=3.5.2
numpy
Pillow>=9.0.0
progress>=1.6
ruamel.yaml>=0.17.21
ruamel.yaml.clib>=0.2.6
scikit-image>=0.19.3
scikit-learn>=1.0.2
scipy>=1.10.0
tensorboard>=2.9.1
timm>=0.5.4
torch>=1.12.0
torchaudio>=0.12.0
torchvision>=0.13.0
tqdm>=4.64.0
transformers>=4.30.0
wandb
aim
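
For anyone wanting to repeat the conversion described in the pitch, here is a minimal sketch of one way to do it. The function name `relax_pins`, the helper `_normalize`, and the sample input are all hypothetical (the sample is not the repository's actual requirements.txt, and `kiwisolver==1.4.3` is a made-up pin standing in for a sub-dependency). It keeps only the packages named as top-level and turns each `==` pin into a `>=` minimum, leaving already-flexible lines untouched.

```python
import re

# Matches "name==version" at the start of a requirement line.
_PIN = re.compile(r"^\s*([A-Za-z0-9_.\-]+)\s*==\s*([^\s#;]+)")


def _normalize(name):
    """Normalize a package name so 'Pillow', 'pillow' and 'ruamel_yaml' style variants compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def relax_pins(lines, keep):
    """Keep only the packages listed in `keep` and rewrite '==' pins as '>=' minimums."""
    keep_normalized = {_normalize(name) for name in keep}
    relaxed = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        # The package name is everything before the first operator, marker, or extras bracket.
        name = re.split(r"[<>=!~;\s\[]", stripped, maxsplit=1)[0]
        if _normalize(name) not in keep_normalized:
            continue  # treated as a sub-dependency and dropped
        relaxed.append(_PIN.sub(r"\1>=\2", stripped))
    return relaxed


# Illustrative input only; not the real contents of requirements.txt.
pinned = [
    "torch>=1.12.0",
    "timm==0.5.4",
    "kiwisolver==1.4.3",
    "numpy",
    "# a comment",
    "Pillow==9.0.0",
]
top_level = ["torch", "timm", "numpy", "pillow"]

for requirement in relax_pins(pinned, top_level):
    print(requirement)
```

Deciding which packages count as top-level still has to be done by hand (or by inspecting the imports); the sketch only automates the mechanical rewrite.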

I've already tested this to check that the training script still runs. It only requires one change to vit.py, changing the import path of to_2tuple to: from timm.models.layers import to_2tuple.
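
Since minimum versions mean users may end up with different timm releases, one possible way to make the `to_2tuple` import tolerant is a fallback chain. This is a sketch only, not the change made in vit.py: it assumes that newer timm releases also expose `to_2tuple` under `timm.layers`, and the final local definition is a hypothetical stand-in so that the snippet runs even without timm installed.

```python
try:
    # Assumption: newer timm releases expose the helper here.
    from timm.layers import to_2tuple
except ImportError:
    try:
        # Import path proposed in this issue.
        from timm.models.layers import to_2tuple
    except ImportError:
        # Hypothetical stand-in used only when timm is not installed.
        def to_2tuple(x):
            """Return x as a 2-tuple, repeating a scalar value twice."""
            if isinstance(x, (tuple, list)):
                return tuple(x)
            return (x, x)


# Example: a scalar patch size becomes a (height, width) pair.
patch_size = to_2tuple(16)
print(patch_size)
```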

Alternatives

None.

Additional context

None.

Hhhhhhao commented 1 year ago

Sounds good. This would be a necessary update.

adamtupper commented 1 year ago

Great! I'll submit a PR 👍