Open YSXXXXXXX opened 9 months ago
It seems you're encountering an issue with multiprocessing. The error occurs when DataLoader is used with multiple workers (num_workers > 0), which requires pickling and unpickling data.
This can sometimes depend on the Python version in use, so first check your Python version; we are using Python 3.9. Second, a potential workaround is to set the number of workers to 0 on line 166 of TransWorldNG\transworld\transworld_exp.py. This disables multiprocessing for data loading, after which you can narrow down the cause of the problem.
Hi, lovelybirds,
Thank you for your reply.
I agree with your second method. As for the first, as you can see in the issue description, my Python version is also 3.9. I think queue.Queue (transworld/game/core/node.py, line 9) may be unsuitable for multiple workers.
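To illustrate why queue.Queue is a problem here, the following is a minimal standalone sketch using only the standard library. It does not use the project's code; the helper name pickle_error is hypothetical. It shows that a queue.Queue, and any container holding one, fails to pickle because of the lock objects it carries.

```python
import pickle
import queue


def pickle_error(obj):
    """Return None if obj pickles cleanly, otherwise the TypeError message."""
    try:
        pickle.dumps(obj)
    except TypeError as exc:
        return str(exc)
    return None


# A bare queue.Queue carries threading locks, which pickle rejects.
bare_error = pickle_error(queue.Queue())

# The failure propagates to any object that holds a queue.Queue.
wrapped_error = pickle_error({"inbox": queue.Queue()})

print(bare_error)
print(wrapped_error)
```

When DataLoader starts worker processes under the spawn start method, the objects handed to each worker have to be pickled, so anything reachable from them that holds a queue.Queue would hit the same error.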
Hi, lovelybirds, when I set the number of workers to 0, there is another problem:
Hi nudtdyk,
I noticed the error on line 218 at batch//num_workers. My apologies for suggesting num_workers = 0. Please try num_workers = 1 to avoid the integer division by zero.
We're in the process of creating a Docker environment with the same configuration, which should help prevent such issues in the future.
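As a side note on the division by zero mentioned above, one possible guard is sketched below. The function name per_worker_batch and its parameters are hypothetical, since the actual code on line 218 is not shown in this thread. The sketch assumes that zero workers should be treated as a single in-process loader.

```python
def per_worker_batch(batch_size, num_workers):
    """Split a batch size across workers without dividing by zero.

    num_workers == 0 means data is loaded in the main process,
    so it is treated here as one worker (an assumption of this sketch).
    """
    return batch_size // max(num_workers, 1)


print(per_worker_batch(32, 0))
print(per_worker_batch(32, 4))
```

With a guard like this, num_workers = 0 would remain usable for debugging instead of requiring num_workers = 1.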
Describe the bug
Hi, when I run the transworld_exp.py file, the following error occurs:
To Reproduce
Run the Python file TransWorldNG\transworld\transworld_exp.py.
Expected behavior
train_loader should return the sampled current and next timestamp graphs, at line 40.
Desktop:
Possible solution
I noticed that w.start() appears in the Python traceback, so I checked the objects contained in the parameter args (see the following Python statement) and found that self._collate_fn cannot be pickled. This error may be associated with the class Node, which uses queue.Queue (queue.Queue holds a _thread.lock). I think collections.deque could be a suitable replacement. For more information, please see "TypeError: can't pickle _thread.lock objects" and "Python Multiprocessing Pool.map Causes Error in __new__".
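A minimal sketch of the collections.deque idea follows. DequeNode, put, and get are hypothetical names and are not taken from the project's Node class; the sketch only shows that a deque-backed FIFO buffer survives a pickle round trip.

```python
import pickle
from collections import deque


class DequeNode:
    """Hypothetical node that buffers messages in a deque instead of queue.Queue."""

    def __init__(self, name, maxlen=None):
        self.name = name
        self.inbox = deque(maxlen=maxlen)  # no lock objects inside

    def put(self, item):
        # Append on the right so that get() returns items in FIFO order.
        self.inbox.append(item)

    def get(self):
        # Raises IndexError when empty; it does not block like queue.Queue.get().
        return self.inbox.popleft()


node = DequeNode("n0")
node.put({"t": 0})
node.put({"t": 1})

# The round trip succeeds because a deque holds only its items.
clone = pickle.loads(pickle.dumps(node))
first = clone.get()
print(first)
```

One trade-off to keep in mind: deque offers no blocking get or put, so this swap would only fit if Node does not rely on queue.Queue for waiting between threads.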