Closed Aditya-Tandon closed 1 year ago
Hi, thanks for your interest in SidechainNet!
Can you share your notebook/the offending code? I'm not sure why multiprocessing is being called.
Hi, here's a screenshot of the code with the error:
Thanks for helping!
Hmm, I'm sorry, but I'm unable to reproduce the issue you're experiencing. Let's try to figure this out.
Hi, yes, I am using sidechainnet version 0.7.6.
I checked your Colab notebook and it's definitely the same as what I am doing.
Yeah sure, here's the error I receive on running the cell along with the cell that produces the error:
Code:

for i in data2['train']:
    break

print(type(i))
Error:
AttributeError                            Traceback (most recent call last)
Cell 4 in ()
----> 1 for i in data2['train']:
      2     break
      4 print(type(i))

File /opt/anaconda3/lib/python3.9/site-packages/torch/utils/data/dataloader.py:444, in DataLoader.__iter__(self)
    442     return self._iterator
    443 else:
--> 444     return self._get_iterator()

File /opt/anaconda3/lib/python3.9/site-packages/torch/utils/data/dataloader.py:390, in DataLoader._get_iterator(self)
    388 else:
    389     self.check_worker_number_rationality()
--> 390     return _MultiProcessingDataLoaderIter(self)

File /opt/anaconda3/lib/python3.9/site-packages/torch/utils/data/dataloader.py:1077, in _MultiProcessingDataLoaderIter.__init__(self, loader)
   1070 w.daemon = True
   1071 # NB: Process.start() actually take some time as it needs to
   1072 # start a process and pass the arguments over via a pipe.
   1073 # Therefore, we only add a worker to self._workers list after
   1074 # it started, so that we do not call .join() if program dies
   1075 # before it starts, and __del__ tries to join but will get:
   1076 #     AssertionError: can only join a started process.
-> 1077 w.start()
   1078 self._index_queues.append(index_queue)
   1079 self._workers.append(w)

File /opt/anaconda3/lib/python3.9/multiprocessing/process.py:121, in BaseProcess.start(self)
    118 assert not _current_process._config.get('daemon'), \
    119        'daemonic processes are not allowed to have children'
    120 _cleanup()
--> 121 self._popen = self._Popen(self)
    122 self._sentinel = self._popen.sentinel
    123 # Avoid a refcycle if the target function holds an indirect
    124 # reference to the process object (see bpo-30775)

File /opt/anaconda3/lib/python3.9/multiprocessing/context.py:224, in Process._Popen(process_obj)
    222 @staticmethod
    223 def _Popen(process_obj):
--> 224     return _default_context.get_context().Process._Popen(process_obj)

File /opt/anaconda3/lib/python3.9/multiprocessing/context.py:284, in SpawnProcess._Popen(process_obj)
    281 @staticmethod
    282 def _Popen(process_obj):
    283     from .popen_spawn_posix import Popen
--> 284     return Popen(process_obj)

File /opt/anaconda3/lib/python3.9/multiprocessing/popen_spawn_posix.py:32, in Popen.__init__(self, process_obj)
     30 def __init__(self, process_obj):
     31     self._fds = []
---> 32     super().__init__(process_obj)

File /opt/anaconda3/lib/python3.9/multiprocessing/popen_fork.py:19, in Popen.__init__(self, process_obj)
     17 self.returncode = None
     18 self.finalizer = None
---> 19 self._launch(process_obj)

File /opt/anaconda3/lib/python3.9/multiprocessing/popen_spawn_posix.py:47, in Popen._launch(self, process_obj)
     45 try:
     46     reduction.dump(prep_data, fp)
---> 47     reduction.dump(process_obj, fp)
     48 finally:
     49     set_spawning_popen(None)

File /opt/anaconda3/lib/python3.9/multiprocessing/reduction.py:60, in dump(obj, file, protocol)
     58 def dump(obj, file, protocol=None):
     59     '''Replacement for pickle.dump() using ForkingPickler.'''
---> 60     ForkingPickler(file, protocol).dump(obj)

AttributeError: Can't pickle local object 'get_collate_fn.<locals>.collate_fn'
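(For anyone hitting the same error: spawn-based worker start-up pickles the DataLoader's collate function, and Python's pickle cannot serialize a function defined inside another function. A minimal stdlib-only reproduction, independent of SidechainNet — the function names here mirror the traceback but are otherwise hypothetical:)

```python
import pickle

def get_collate_fn():
    # A function defined inside another function is a "local object":
    # pickle stores functions by qualified name, and a name like
    # get_collate_fn.<locals>.collate_fn cannot be re-imported by a
    # child process, so pickling it fails.
    def collate_fn(batch):
        return batch
    return collate_fn

fn = get_collate_fn()
try:
    pickle.dumps(fn)
    picklable = True
except AttributeError:
    picklable = False
print(picklable)  # False
```

This is exactly what happens at the bottom of the traceback: spawning a worker process calls `reduction.dump()`, which pickles the `Process` object along with the collate function it references.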
I wonder if it might just be an issue with macOS, as the same code runs perfectly on a Linux machine with the same Python version and VS Code.
Oh interesting. Thanks for mentioning that. I really only develop this code on linux. I used to have continuous integration tests for Mac, but they died when Travis CI went private.
Do you have an M1 Mac? Others have reported issues with dataloaders on Apple Silicon (example). In that thread, they suggest setting num_workers to 0. You can try this in scn.load().
Edit: I have also seen discussions suggesting the multiprocessing start method fork instead of the default, spawn.
Yes, I have an M1 Mac. Both setting num_workers to 0 and switching multiprocessing to fork fix the issue.
Thanks for the help! :)
I am getting a pickle error while trying to load the DataLoader. Works fine on colab but not on my Mac. Unsure what the cause might be. Here's a screenshot of the error:
This is the code snippet causing the error:
data = []
for batch in d["train"]:
    data.append(batch)