ContinualAI / avalanche

Avalanche: an End-to-End Library for Continual Learning based on PyTorch.
http://avalanche.continualai.org
MIT License

KeyError in multihead.py example when not using num_workers on MPS #1656

Open guilhermegog opened 4 months ago

guilhermegog commented 4 months ago

Hey everyone, first off, thank you for developing this amazing library.

🐛 Describe the bug
There appears to be a bug, though I am not sure whether it comes from avalanche or from mps. If you run the multihead example without setting the num_workers parameter (which I assume defaults to 1), the training loop stops working and a KeyError is raised with a seemingly random key.

🐜 To Reproduce
In the multihead.py example, replace lines 71 and 72 with the following lines:

    strategy.train(train_task)
    strategy.eval(test_stream)

🦋 Additional context
The issue only seems to occur when running the script on 'mps' devices. When I run the same code on a server with cuda, the issue does not persist.
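
To narrow this down, it may help to run the two call variants from the reproduction steps side by side and record which one raises, instead of letting the first KeyError abort the script. The sketch below is only an illustration: `run_variants` and `fake_train` are hypothetical names that are not part of avalanche, `fake_train` merely imitates the reported behaviour so the snippet runs without the library, and `num_workers=4` is an arbitrary value chosen for the example.

```python
def run_variants(run, variants):
    """Call run(**kwargs) once per variant and record the outcome.

    A KeyError is captured and stored as text so that the remaining
    variants still run.
    """
    results = {}
    for name, kwargs in variants.items():
        try:
            run(**kwargs)
            results[name] = "ok"
        except KeyError as err:
            # repr keeps both the exception type and the offending key
            results[name] = repr(err)
    return results


def fake_train(num_workers=None):
    """Hypothetical stand-in for strategy.train, NOT avalanche code.

    It imitates the reported symptom: a KeyError when num_workers
    is left unset.
    """
    if num_workers is None:
        raise KeyError(1234)


variants = {
    "without num_workers": {},
    "with num_workers": {"num_workers": 4},  # arbitrary illustrative value
}
outcomes = run_variants(fake_train, variants)
print(outcomes)
```

In the real script, the stand-in could be swapped for something like `lambda **kw: strategy.train(train_task, **kw)`, assuming `strategy` and `train_task` are defined as in multihead.py. Running this once on 'mps' and once on 'cuda' would show whether the failure depends on the device alone or also on the num_workers argument.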