sensorium-competition / experanto

Python package to interpolate recordings and stimuli of neuroscience experiments
MIT License

namedtuple is incompatible with num_workers>0 in PyTorch DataLoader #32

Open KonstantinWilleke opened 2 months ago

KonstantinWilleke commented 2 months ago

The `__getitem__` of the datasets cannot handle a namedtuple when there are multiple parallel workers. Parallel workers are needed to reach high data-loading speeds on powerful compute nodes. This is a fundamental PyTorch issue: each worker process instantiates its own Dataset object, and the namedtuple class is re-created inside each one, so the class is not importable under a stable module-level name and the workers cannot collate batches into the "custom" namedtuple.
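The failure mode can be reproduced without PyTorch: a namedtuple class created inside a Dataset instance is not an attribute of any module, so pickle (which DataLoader workers use to ship batches between processes) cannot serialize its instances. A minimal sketch, with a hypothetical `Dataset` standing in for the real one:

```python
import pickle
from collections import namedtuple


class Dataset:
    def __init__(self, fields):
        # Problematic pattern (hypothetical sketch): the namedtuple class is
        # created per Dataset instance, e.g. from data-dependent field names.
        # The resulting class is not reachable as a module attribute, so
        # pickle cannot serialize its instances for transfer between
        # DataLoader worker processes.
        self.Batch = namedtuple("Batch", fields)

    def __getitem__(self, idx):
        return self.Batch(responses=idx, stimuli=idx * 2)


ds = Dataset(["responses", "stimuli"])
item = ds[0]

try:
    pickle.dumps(item)
    print("pickled OK")
except pickle.PicklingError as err:
    # pickle tries to look the class up by name and fails
    print("failed:", err)
```

With `num_workers=0` everything runs in one process, no pickling happens, and the problem stays hidden.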

Potential workarounds:

pollytur commented 2 weeks ago

Another potential solution is to do the same as in neuralpredictors and define the namedtuple globally (at module level); then it works with num_workers>0.
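A minimal sketch of why the module-level definition works: the class then has a stable importable name, so worker processes can pickle instances, and collation can rebuild the same namedtuple type for a batch. The `DataPoint` field names and the `collate_namedtuples` helper below are illustrative, not experanto's actual API; the helper mimics what PyTorch's `default_collate` does for namedtuples.

```python
from collections import namedtuple

# Module-level definition (the neuralpredictors-style workaround): the class
# is an attribute of the module, so pickle can find it by name and DataLoader
# workers can exchange batches of it.  Field names are illustrative.
DataPoint = namedtuple("DataPoint", ["videos", "responses"])


def collate_namedtuples(batch):
    # Mimics default_collate's namedtuple branch: rebuild the same
    # namedtuple type, collating each field across the batch.
    elem_type = type(batch[0])
    return elem_type(*(list(field) for field in zip(*batch)))


samples = [DataPoint(videos=i, responses=i * 10) for i in range(4)]
batch = collate_namedtuples(samples)
print(batch.videos)     # [0, 1, 2, 3]
print(batch.responses)  # [0, 10, 20, 30]
```

The trade-off is that the field names must be fixed at import time rather than derived per dataset instance.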