catalyst-team / catalyst

Accelerated deep learning R&D
https://catalyst-team.com
Apache License 2.0

21.03 breaks with BatchPrefetchLoaderWrapper #1149

Closed Entodi closed 3 years ago

Entodi commented 3 years ago

🐛 Bug Report

loaders = get_loaders_from_params(initial_seed=initial_seed, **loaders) does not seem to work with BatchPrefetchLoaderWrapper

How To Reproduce

Steps to reproduce the behavior: run the following code sample

Code sample

import os
from torch import nn, optim
from torch.utils.data import DataLoader
from catalyst import dl, utils
from catalyst.data.transforms import ToTensor
from catalyst.contrib.datasets import MNIST
from catalyst.data.loader import BatchPrefetchLoaderWrapper

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.02)

loaders = {
    "train": DataLoader(
        MNIST(os.getcwd(), train=True, download=True, transform=ToTensor()), batch_size=32
    ),
    "valid": DataLoader(
        MNIST(os.getcwd(), train=False, download=True, transform=ToTensor()), batch_size=32
    ),
}
for key in loaders.keys():
    loaders[key] = BatchPrefetchLoaderWrapper(
        loaders[key], num_prefetches=32)

runner = dl.SupervisedRunner(
    input_key="features", output_key="logits", target_key="targets", loss_key="loss"
)
# model training
runner.train(
    model=model,
    criterion=criterion,
    optimizer=optimizer,
    loaders=loaders,
    num_epochs=1,
    callbacks=[
        dl.AccuracyCallback(input_key="logits", target_key="targets", topk_args=(1, 3, 5)),
        dl.PrecisionRecallF1SupportCallback(
            input_key="logits", target_key="targets", num_classes=10
        ),
        dl.AUCCallback(input_key="logits", target_key="targets"),
        # catalyst[ml] required ``pip install catalyst[ml]``
        # dl.ConfusionMatrixCallback(input_key="logits", target_key="targets", num_classes=10),
    ],
    logdir="./logs",
    valid_loader="valid",
    valid_metric="loss",
    minimize_valid_metric=True,
    verbose=True,
    load_best_on_end=True,
)

Screenshots

Traceback (most recent call last):
  File "catalyst_prefetch.py", line 29, in <module>
    runner.train(
  File "/data/users2/afedorov/trends/anaconda3/envs/lucy/lib/python3.8/site-packages/catalyst/runners/runner.py", line 320, in train
    self.run()
  File "/data/users2/afedorov/trends/anaconda3/envs/lucy/lib/python3.8/site-packages/catalyst/core/runner.py", line 678, in run
    self._run_event("on_exception")
  File "/data/users2/afedorov/trends/anaconda3/envs/lucy/lib/python3.8/site-packages/catalyst/core/runner.py", line 601, in _run_event
    getattr(self, event)(self)
  File "/data/users2/afedorov/trends/anaconda3/envs/lucy/lib/python3.8/site-packages/catalyst/core/runner.py", line 593, in on_exception
    raise self.exception
  File "/data/users2/afedorov/trends/anaconda3/envs/lucy/lib/python3.8/site-packages/catalyst/core/runner.py", line 675, in run
    self._run_experiment()
  File "/data/users2/afedorov/trends/anaconda3/envs/lucy/lib/python3.8/site-packages/catalyst/core/runner.py", line 665, in _run_experiment
    self._run_stage()
  File "/data/users2/afedorov/trends/anaconda3/envs/lucy/lib/python3.8/site-packages/catalyst/core/runner.py", line 646, in _run_stage
    self._run_event("on_stage_start")
  File "/data/users2/afedorov/trends/anaconda3/envs/lucy/lib/python3.8/site-packages/catalyst/core/runner.py", line 597, in _run_event
    getattr(self, event)(self)
  File "/data/users2/afedorov/trends/anaconda3/envs/lucy/lib/python3.8/site-packages/catalyst/core/runner.py", line 502, in on_stage_start
    self._setup_loaders()
  File "/data/users2/afedorov/trends/anaconda3/envs/lucy/lib/python3.8/site-packages/catalyst/core/runner.py", line 460, in _setup_loaders
    loaders = self.get_loaders(stage=self.stage_key)
  File "/data/users2/afedorov/trends/anaconda3/envs/lucy/lib/python3.8/site-packages/catalyst/runners/runner.py", line 147, in get_loaders
    self._loaders = _process_loaders(loaders=self._loaders, initial_seed=self.seed)
  File "/data/users2/afedorov/trends/anaconda3/envs/lucy/lib/python3.8/site-packages/catalyst/runners/runner.py", line 45, in _process_loaders
    loaders = get_loaders_from_params(initial_seed=initial_seed, **loaders)
  File "/data/users2/afedorov/trends/anaconda3/envs/lucy/lib/python3.8/site-packages/catalyst/utils/data.py", line 142, in get_loaders_from_params
    assert isinstance(
AssertionError: <catalyst.data.loader.BatchPrefetchLoaderWrapper object at 0x7f9aa4e6d310> should be Dataset or Dict. Got: <catalyst.data.loader.BatchPrefetchLoaderWrapper object at 0x7f9aa4e6d310>
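The failing assertion can be reproduced in isolation. The sketch below is hypothetical (`LoaderWrapper` and `validate` are stand-ins, not Catalyst code): it shows how a wrapper that merely holds a `DataLoader`, without subclassing it, fails an `isinstance`-based validation like the one in the traceback.

```python
from torch.utils.data import DataLoader, Dataset

class LoaderWrapper:
    """Stand-in for BatchPrefetchLoaderWrapper: wraps a loader without subclassing it."""
    def __init__(self, loader):
        self.loader = loader
    def __iter__(self):
        return iter(self.loader)

def validate(value):
    # Simplified stand-in for the isinstance check that raises in the traceback
    assert isinstance(value, (Dataset, DataLoader, dict)), (
        f"{value} should be Dataset or Dict. Got: {value}"
    )

plain = DataLoader([1, 2, 3])
validate(plain)  # passes: a real DataLoader

try:
    validate(LoaderWrapper(plain))  # the wrapper is none of the accepted types
except AssertionError:
    print("assertion fired")
```

Because the wrapper is not a `Dataset`, `DataLoader`, or `dict`, any re-validation of the loaders dict rejects it, even though it is perfectly iterable.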

Expected behavior

The code should train without errors.

Environment

Catalyst version: 21.03.2
PyTorch version: 1.8.1
Is debug build: No
CUDA used to build PyTorch: 11.1
TensorFlow version: N/A
TensorBoard version: N/A

OS: linux
GCC version: (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39)
CMake version: Could not collect

Python version: 3.8
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: GeForce RTX 2080 Ti
Nvidia driver version: 455.32.00
cuDNN version: Could not collect

Versions of relevant libraries:
[pip3] catalyst==21.3.2
[pip3] numpy==1.20.2
[pip3] tensorboardX==2.1
[pip3] torch==1.8.1
[pip3] torchaudio==0.8.0a0+e4e171a
[pip3] torchfile==0.1.0
[pip3] torchnet==0.0.4
[pip3] torchvision==0.9.1
[conda] blas 1.0 mkl
[conda] catalyst 21.3.2 pypi_0 pypi
[conda] cudatoolkit 11.1.1 h6406543_8 conda-forge
[conda] ffmpeg 4.3 hf484d3e_0 pytorch
[conda] mkl 2020.4 h726a3e6_304 conda-forge
[conda] mkl-service 2.3.0 py38h1e0a361_2 conda-forge
[conda] mkl_fft 1.3.0 py38h5c078b8_1 conda-forge
[conda] mkl_random 1.2.0 py38hc5bc63f_1 conda-forge
[conda] numpy 1.20.2 pypi_0 pypi
[conda] pytorch 1.8.1 py3.8_cuda11.1_cudnn8.0.5_0 pytorch
[conda] tensorboardx 2.1 pypi_0 pypi
[conda] torchaudio 0.8.1 py38 pytorch
[conda] torchfile 0.1.0 pypi_0 pypi
[conda] torchnet 0.0.4 pypi_0 pypi
[conda] torchvision 0.9.1 py38_cu111 pytorch

Scitator commented 3 years ago

could you please do the following as a hotfix:

import os
from torch import nn, optim
from torch.utils.data import DataLoader
from catalyst import dl, utils
from catalyst.data.transforms import ToTensor
from catalyst.contrib.datasets import MNIST
from catalyst.data.loader import BatchPrefetchLoaderWrapper

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.02)

loaders = {
    "train": DataLoader(
        MNIST(os.getcwd(), train=True, download=True, transform=ToTensor()), batch_size=32
    ),
    "valid": DataLoader(
        MNIST(os.getcwd(), train=False, download=True, transform=ToTensor()), batch_size=32
    ),
}
for key in loaders.keys():
    loaders[key] = BatchPrefetchLoaderWrapper(
        loaders[key], num_prefetches=32)

class CustomSupervisedRunner(dl.SupervisedRunner):
    def get_loaders(self, stage):
        return self._loaders

runner = CustomSupervisedRunner(
    input_key="features", output_key="logits", target_key="targets", loss_key="loss"
)
# model training
runner.train(
    model=model,
    criterion=criterion,
    optimizer=optimizer,
    loaders=loaders,
    num_epochs=1,
    callbacks=[
        dl.AccuracyCallback(input_key="logits", target_key="targets", topk_args=(1, 3, 5)),
        dl.PrecisionRecallF1SupportCallback(
            input_key="logits", target_key="targets", num_classes=10
        ),
        dl.AUCCallback(input_key="logits", target_key="targets"),
        # catalyst[ml] required ``pip install catalyst[ml]``
        # dl.ConfusionMatrixCallback(input_key="logits", target_key="targets", num_classes=10),
    ],
    logdir="./logs",
    valid_loader="valid",
    valid_metric="loss",
    minimize_valid_metric=True,
    verbose=True,
    load_best_on_end=True,
)

? We will drop a correct solution next release ;)
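For context on why the override helps: per the traceback, the runner stores the user-supplied dict on self._loaders, and the default get_loaders re-processes it through get_loaders_from_params, which is where the wrapper is rejected. A minimal, hypothetical sketch of the pattern (BaseRunner here is illustrative, not Catalyst code):

```python
class BaseRunner:
    """Illustrative stand-in for the runner's loader handling."""
    def __init__(self, loaders):
        # the user-supplied loaders dict is stored on the instance
        self._loaders = loaders

    def get_loaders(self, stage):
        # default path: re-validates every entry, rejecting wrapper objects
        raise AssertionError("should be Dataset or Dict")

class PatchedRunner(BaseRunner):
    def get_loaders(self, stage):
        # hotfix: return the stored dict untouched, skipping re-validation
        return self._loaders

runner = PatchedRunner({"train": "wrapped-loader"})
print(runner.get_loaders("train"))  # {'train': 'wrapped-loader'}
```

The override simply short-circuits the re-validation step, so already-constructed (and wrapped) loaders pass through unchanged.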

Scitator commented 3 years ago

Fixed with 21.04 ;)