Open carmocca opened 1 year ago
I am migrating my code to PL 2 and it seems that getting a val dataloader batch of the form {"key_a": batch_dataloader_a, "key_b": batch_dataloader_b} is not implemented in PL 2 yet. Here is my old code for reference:
```python
def val_dataloader(self):
    val_dataloaders = {
        key: DataLoader(
            dataset,
            batch_size=dataset.batch_size,
            shuffle=False,
            num_workers=dataset.num_workers,
            pin_memory=False,
        )
        for key, dataset in self.val_datasets.items()
    }
    combined_val_loaders = CombinedLoader(val_dataloaders, "max_size_cycle")
    return combined_val_loaders
```
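For readers unfamiliar with the `"max_size_cycle"` mode used above, here is a minimal pure-Python sketch of the batch shape it produces per step (a dict with one batch per key, with shorter loaders cycled until the longest is exhausted). This is an illustrative stand-in, not Lightning's actual `CombinedLoader` implementation:

```python
from itertools import cycle, islice

def max_size_cycle(loaders):
    # loaders: dict mapping key -> list of batches (stand-in for DataLoaders).
    # Yields one dict per step; shorter loaders restart from the beginning
    # ("cycle") until the longest loader is exhausted.
    longest = max(len(batches) for batches in loaders.values())
    iters = {key: islice(cycle(batches), longest) for key, batches in loaders.items()}
    for _ in range(longest):
        yield {key: next(it) for key, it in iters.items()}

batches = list(max_size_cycle({"key_a": [1, 2, 3], "key_b": [10, 20]}))
# "key_b" wraps around on the third step:
# [{'key_a': 1, 'key_b': 10}, {'key_a': 2, 'key_b': 20}, {'key_a': 3, 'key_b': 10}]
```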
@mees I added support for that in #17163, if you want to give it a try. The PR only implements it for validation and testing.
Really helpful! I hope this gets into "stable" soon, or even the next release!
I really wish there was `sequential` support in the training loop. Right now, it's not clear how one should handle batches of potentially different sizes in the `training_step`. We'd have to collate inside the `training_step` and ensure the given batch size is divided by the number of data loaders to keep gradient accumulation consistent, etc. It gets pretty ugly.
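To make the request concrete, here is a hypothetical sketch of what a `"sequential"` mode would yield in the training loop: each loader is consumed fully, one after another, with a `batch_idx` and `dataloader_idx` so the step can dispatch per loader instead of collating mixed batches. This mirrors the evaluation-side behavior discussed in the thread; it is an assumed sketch, not Lightning's implementation:

```python
def sequential(loaders):
    # loaders: list of batch sequences (stand-in for DataLoaders).
    # Exhausts each loader in turn, tagging every batch with its position
    # within the loader (batch_idx) and which loader it came from
    # (dataloader_idx).
    for dataloader_idx, batches in enumerate(loaders):
        for batch_idx, batch in enumerate(batches):
            yield batch, batch_idx, dataloader_idx

steps = list(sequential([[1, 2], [10]]))
# [(1, 0, 0), (2, 1, 0), (10, 0, 1)]
```

With this shape, a `training_step` could branch on `dataloader_idx` rather than receiving batches of different sizes fused together.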
@carmocca Thank you for your work on this issue. Not to rush you, but any update on the `sequential` support in the training loop? Thanks again!
Unfortunately, I don't have the bandwidth to work on this now. If somebody wants to try, I can help get the PR merged. You can follow the structure in the EvaluationLoop. The training hooks will need an optional `dataloader_idx` argument.
Me too! Is there any release timeline / nightly version with this supported? I can't use lightning without this and really would like to leverage its other features!
Ditto! FYI for others: pulling the nightly will get you the feature: https://github.com/Lightning-AI/lightning/pull/17163
Thanks! I also need this great feature.
+1, please release this feature asap!
Is this feature currently worked on?
As far as I know, nobody is currently working on it, Lukas
I would also really like this feature. I use CombinedLoader to encapsulate modality-specific DataLoaders, so that modalities with fewer data than our largest modality get recycled. For this reason, I use CombinedLoader with "max_size_cycle"/"min_size" for train/validation and would like to be able to use it for predict as well. Thanks for considering it!
Description & Motivation
- `trainer.fit` only works with `CombinedLoader(..., mode="max_size_cycle"|"min_size")`
- `trainer.{validate,test,predict}` only works with `CombinedLoader(..., mode="sequential")`
This constraint is checked in the top-level loops: https://github.com/Lightning-AI/lightning/blob/0009cde1db1a9ab4e2f1e0a9f69a4affb59d5134/src/lightning/pytorch/loops/fit_loop.py#L351-L354 https://github.com/Lightning-AI/lightning/blob/0009cde1db1a9ab4e2f1e0a9f69a4affb59d5134/src/lightning/pytorch/loops/evaluation_loop.py#L182-L183
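To clarify the contrast between the two fit-side modes, a minimal pure-Python sketch of the assumed `"min_size"` semantics: iteration stops with the shortest loader instead of cycling it, so every step sees one fresh batch per key. Again an illustrative stand-in, not Lightning code:

```python
def min_size(loaders):
    # loaders: dict mapping key -> list of batches (stand-in for DataLoaders).
    # zip stops at the shortest loader, so no batch is ever repeated;
    # the tail of the longer loaders is simply dropped.
    for batches in zip(*loaders.values()):
        yield dict(zip(loaders.keys(), batches))

short = list(min_size({"a": [1, 2, 3], "b": [10, 20]}))
# [{'a': 1, 'b': 10}, {'a': 2, 'b': 20}]
```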
Pitch
Have all trainer functions support all modes
TODO:
Alternatives
Not do it
Additional context
This builds on top of https://github.com/Lightning-AI/lightning/pull/16726
cc @borda @justusschock @awaelchli