Open stsouko opened 1 year ago
Hi, are there any workarounds for this? I'm hitting it now.
I am pretty sure that when I pass in `lengths` to `pack_sequence` inside my `collate_fn` it's a list of `int` (the last dimension of a tensor's `size` attribute). I haven't figured out a good way to debug this. Setting `devices=1` does work for now, but it would be nice to be able to run with DDP. Thanks!
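One possible workaround, sketched here under the assumption that the failure is tied to the `PackedSequence` object travelling inside the batch (nothing in this thread confirms that), is to have the `collate_fn` return plain padded tensors plus lengths and to pack only inside the model. The names `collate_padded` and `pack_in_model` are hypothetical, and each item is assumed to be a tensor of shape `(seq_len, features)`:

```python
import torch
from torch.nn.utils.rnn import PackedSequence, pack_padded_sequence, pad_sequence


def collate_padded(batch):
    """Hypothetical collate_fn: return plain tensors instead of a PackedSequence."""
    # Assumption: every item is a tensor of shape (seq_len, features).
    lengths = torch.tensor([seq.size(0) for seq in batch], dtype=torch.long)
    # Pad to the longest sequence: result has shape (batch, max_len, features).
    padded = pad_sequence(batch, batch_first=True)
    return padded, lengths


def pack_in_model(padded, lengths):
    """Pack inside forward(), after the batch has been moved to the device."""
    # pack_padded_sequence expects the lengths tensor to live on the CPU.
    return pack_padded_sequence(
        padded, lengths.cpu(), batch_first=True, enforce_sorted=False
    )


# Toy batch of three variable-length sequences with 3 features each.
batch = [torch.randn(5, 3), torch.randn(2, 3), torch.randn(4, 3)]
padded, lengths = collate_padded(batch)
packed = pack_in_model(padded, lengths)
```

With this layout the batch that reaches the trainer contains only ordinary tensors, so the packing step happens per process inside the model. Whether that avoids the DDP problem reported here is untested.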
Bug description
Batches with `PackedSequence`s and DDP don't work. On a single GPU everything is OK.
How to reproduce the bug
the structure of the batch.
Error messages and logs
Environment
Current environment
```
#- Lightning Component (e.g. Trainer, LightningModule, LightningApp, LightningWork, LightningFlow):
#- PyTorch Lightning Version (e.g., 1.5.0):
#- Lightning App Version (e.g., 0.5.2):
#- PyTorch Version (e.g., 1.10):
#- Python version (e.g., 3.9):
#- OS (e.g., Linux):
#- CUDA/cuDNN version:
#- GPU models and configuration:
#- How you installed Lightning(`conda`, `pip`, source):
#- Running environment of LightningApp (e.g. local, cloud):
```
More info
No response
cc @justusschock @awaelchli