Open johnathanchiu opened 1 year ago
I am currently using `CombinedLoader` (https://lightning.ai/docs/pytorch/stable/api/lightning.pytorch.utilities.combined_loader.html#lightning.pytorch.utilities.combined_loader.CombinedLoader) to combine multiple datasets. It works fine, but I noticed that creating the dataloaders with `num_workers > 0` makes it run extremely slowly. Is there a logical explanation for this? If not, I suspect this is a bug. I attached a chunk of my code to show what I am doing.
What version are you seeing the problem on? v2.0
```python
import pytorch_lightning as pl
from lightning.pytorch.utilities.combined_loader import CombinedLoader
from torch.utils.data import DataLoader


class CollectiveDataloader(pl.LightningDataModule):
    def __init__(self, datasets, num_workers=8, batch_size=10, shuffle=True):
        super().__init__()
        self.train_set = CollectiveDataset(
            datasets, num_workers, batch_size, shuffle
        ).datasets

    def train_dataloader(self):
        return CombinedLoader(self.train_set, "sequential")


class CollectiveDataset:
    def __init__(self, datasets, num_workers, batch_size, shuffle):
        # datasets is a dictionary of {dataset_name : Dataset object}
        loaded_datasets = {
            name: DataLoader(
                dataset,
                batch_size=batch_size,
                shuffle=shuffle,
                ### SETTING THIS > 0 RUNS REALLY SLOW ###
                num_workers=num_workers,
            )
            for name, dataset in datasets.items()
        }
        self.datasets = loaded_datasets
```
cc @borda
Same issue here
Environment
Current environment
```
#- PyTorch Lightning Version: 2.0.7
#- PyTorch Version: 2.0.1
#- Python version: 3.10.12
#- OS: Linux
#- CUDA/cuDNN version: 12.0
```
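To narrow down where the time goes, it might help to separate the time until the first batch from the time for later batches. The sketch below is only a diagnostic helper, not a fix and not part of Lightning: `time_batches` and `SlowStartLoader` are hypothetical names made up for illustration, and the helper uses only the standard library so it works on any iterable, including a `DataLoader` or the `CombinedLoader` from the snippet above.

```python
import time
from typing import Iterable, Iterator, List, Tuple


def time_batches(loader: Iterable, max_batches: int = 20) -> Tuple[float, List[float]]:
    """Return (seconds until the first batch, seconds for each later batch).

    The clock starts before the iterator is created, so the first value
    also includes any startup cost paid when iteration begins.
    """
    latencies: List[float] = []
    last = time.perf_counter()
    for index, _batch in enumerate(loader):
        now = time.perf_counter()
        latencies.append(now - last)
        last = now
        if index + 1 >= max_batches:
            break
    if not latencies:
        return 0.0, []
    return latencies[0], latencies[1:]


class SlowStartLoader:
    """Hypothetical stand-in for a loader with an expensive startup phase."""

    def __init__(self, startup: float, n: int) -> None:
        self.startup = startup
        self.n = n

    def __iter__(self) -> Iterator[int]:
        time.sleep(self.startup)  # simulated startup cost, paid on every new iterator
        return iter(range(self.n))


first, rest = time_batches(SlowStartLoader(startup=0.05, n=5))
print(f"first batch: {first:.3f}s, later batches: {[round(t, 4) for t in rest]}")
```

Running `time_batches` on one of the `DataLoader` objects built with `num_workers=0` and again with `num_workers=8` should show whether the slowdown sits in the first batch or is spread across every batch. If the first batch dominates, worker startup is a plausible suspect, and the `persistent_workers=True` argument of `DataLoader` may be worth trying. That is an assumption to test, not a confirmed cause of this issue.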