Lightning-AI / pytorch-lightning

Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes.
https://lightning.ai
Apache License 2.0
28.53k stars 3.39k forks source link

Avoid network access during ordinary tests #11422

Open carmocca opened 2 years ago

carmocca commented 2 years ago

Proposed refactor

Some of our tests use the MNIST and TrialMNIST classes which download the MNIST data off the internet (or from a cache).

https://github.com/PyTorchLightning/pytorch-lightning/blob/948cfd24de4f64a2980395581f15544e5e37eab0/tests/helpers/datasets.py#L25

https://github.com/PyTorchLightning/pytorch-lightning/blob/948cfd24de4f64a2980395581f15544e5e37eab0/tests/helpers/datasets.py#L134

Motivation

We should avoid this to reduce the inherent flakiness of network access.

Pitch

If a test uses these classes, do one of 3 options:

Additional context

Master is currently blocked due to errors while reading the MNIST data zipfiles.


If you enjoy Lightning, check out our other projects! ⚡

cc @borda @akihironitta

Borda commented 2 years ago

Mnist and TrialMnist shall be cached in our CI, so there is no internet connectivity issue...

akihironitta commented 2 years ago

@carmocca I see that failing ones (in the conda CI e.g. this workflow) are the followings

tests/helpers/datasets.py::tests.helpers.datasets.TrialMNIST
tests/helpers/test_datasets.py::test_pickling_dataset_mnist[TrialMNIST-args1]
tests/helpers/test_models.py::test_models[None-BasicGAN]
tests/models/test_horovod.py::test_horovod_multi_optimizer

and I think we ~can~ should remove

tests/helpers/datasets.py::tests.helpers.datasets.TrialMNIST
tests/helpers/test_datasets.py::test_pickling_dataset_mnist[TrialMNIST-args1]  # which does nothing really

and replace MNIST with RandomDataset in the rest of the tests.

akihironitta commented 2 years ago

Mnist and TrialMnist shall be cached in our CI, so there is no internet connectivity issue...

@Borda Seems like they are not cached in the conda workflow: https://github.com/PyTorchLightning/pytorch-lightning/actions/runs/1681919780/workflow

carmocca commented 2 years ago

there is no internet connectivity issue...

Sure, but it adds complexity and time. The purpose of each of these ordinary tests should not need the actual data, so if we can make it simpler, it is worth it to me.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it hasn't had any recent activity. This issue will be closed in 7 days if no further activity occurs. Thank you for your contributions, Pytorch Lightning Team!