kkoutini / PaSST

Efficient Training of Audio Transformers with Patchout
Apache License 2.0
287 stars, 48 forks

EOF (End Of File) Error on num_workers>0 #48

Open Rishabh-S1899 opened 3 months ago

Rishabh-S1899 commented 3 months ago

I am trying to fine-tune the model on the DCASE2020 dataset. I prepared an ex_dcase.py file and a dataset.py file modeled on the esc50 ones, but whenever I set num_workers above 0 in the train or test dataloader, I receive an EOF error. Two errors arise. The first:

    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "path\venv\lib\multiprocessing\spawn.py", line 116, in spawn_main
        exitcode = _main(fd, parent_sentinel)
      File "path\venv\lib\multiprocessing\spawn.py", line 126, in _main
        self = reduction.pickle.load(from_parent)
    EOFError: Ran out of input
    ERROR - passt_Dcase2020 - Failed after 0:00:12!

The second error:

    Traceback (most recent calls WITHOUT Sacred internals):
      File "ex_dcase.py", line 436, in default_command
        return main()
      File "ex_dcase.py", line 275, in main
        trainer.fit(
      File "path\venv\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 608, in fit
        call._call_and_handle_interrupt(
      File "path\venv\lib\multiprocessing\popen_spawn_win32.py", line 93, in __init__
        reduction.dump(process_obj, to_child)
      File "path\venv\lib\multiprocessing\reduction.py", line 60, in dump
        ForkingPickler(file, protocol).dump(obj)
    AttributeError: Can't pickle local object 'get_roll_func.<locals>.roll_func'

Can you help me fix these errors, or suggest any changes that could work?

kkoutini commented 3 months ago

Unfortunately, the spawn start method for workers is not supported in the framework because the sacred object can't be pickled. Take a look at https://docs.python.org/3/library/multiprocessing.html#multiprocessing.set_start_method, where you can set the start method to fork.
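
For anyone landing here, a minimal sketch of the suggestion above, using only the standard library. The helper names (`get_roll_func_demo`, `local_function_pickles`, `choose_start_method`) are hypothetical and not part of PaSST; `get_roll_func_demo` is just a stand-in for the nested function named in the traceback. The sketch reproduces the pickling failure that spawn runs into and selects fork only where the platform offers it. One caveat: the standard library does not provide fork on Windows, and the traceback above goes through `popen_spawn_win32.py`, so on that platform the likely options are `num_workers=0` or running under Linux (for example WSL, which is an assumption and untested here).

```python
import multiprocessing as mp
import pickle


def get_roll_func_demo():
    # Hypothetical stand-in for the closure named in the traceback:
    # a function defined inside another function.
    def roll_func(x):
        return x

    return roll_func


def local_function_pickles() -> bool:
    """Return True if a nested function can be pickled (normally it cannot)."""
    try:
        pickle.dumps(get_roll_func_demo())
        return True
    except (AttributeError, pickle.PicklingError):
        # spawn has to pickle objects to send them to workers, so it
        # hits this same failure; fork copies the parent process instead.
        return False


def choose_start_method(preferred: str = "fork") -> str:
    """Return `preferred` if this platform offers it, else the platform default."""
    available = mp.get_all_start_methods()  # the default method is listed first
    return preferred if preferred in available else available[0]


if __name__ == "__main__":
    method = choose_start_method()
    # Set the start method once, early, before any DataLoader workers start.
    mp.set_start_method(method, force=True)
    print("start method:", method)
    print("nested function picklable:", local_function_pickles())
```

In an experiment script, the `set_start_method` call would go at the top of the entry point, before the trainer is created. If `choose_start_method()` returns `spawn`, the workers will still fail for the reason described above.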