Closed nosound2 closed 4 years ago
Same issue here, has anyone found a solution?
Hi @nosound2 and @winklemad, a bit of context on why we left PyTorch 1.6 out of the requirements: as the competition was approaching quickly, we didn't want to include a potentially breaking dependency change, and we didn't have enough time to test it thoroughly.
That being said, this is probably a Windows-related issue (there is a section about it in the README). Your best chance is probably to clone the repo, remove the torch version constraint from `setup.py`, and install from your local copy. You can also do the same by cloning and installing in editable mode if you want to have a local reactive copy.
Thanks @lucabergamini, removing the dependencies worked for me. I did the cloning, and yes, I am on Windows.
I needed to do one more thing.

`from l5kit.dataset import EgoDataset, AgentDataset`

failed for me on the line

`multiprocessing.set_start_method("fork", force=True)  # this fix loop in python 3.8 on MacOS`

apparently because there is no `fork` on Windows. I removed that line from the source file and at least some functionality seems to run now.
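As a quick aside, you can confirm this from the standard library itself: `multiprocessing.get_all_start_methods()` lists the start methods available on the current platform, and `"fork"` is absent from that list on Windows.

```python
import multiprocessing

# "fork" only exists on POSIX systems; on Windows this list contains
# just "spawn", which is why forcing "fork" raises there.
print(multiprocessing.get_all_start_methods())
```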
> `multiprocessing.set_start_method("fork", force=True)  # this fix loop in python 3.8 on MacOS`
Thanks for letting me know, I'll put a platform check there!
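A minimal sketch of what such a platform guard could look like (assuming it replaces the current unconditional call):

```python
import multiprocessing
import sys

# Hypothetical sketch: only force "fork" on platforms that have it.
# Windows supports only "spawn", so the unconditional call fails there.
if sys.platform != "win32":
    multiprocessing.set_start_method("fork", force=True)  # fixes loop in python 3.8 on MacOS
```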
@lucabergamini I still get the fork/broken pipe error. Could you have a look? I am running Windows 10, Torch 1.5.1, Python 3.7.9, and l5kit 1.6; I have also tried l5kit master and l5kit 1.5 after the fix. I think running Linux is the only solution for now.
> ---------------------------------------------------------------------------
> BrokenPipeError Traceback (most recent call last)
> <ipython-input-40-b7d59c605d01> in <module>
> 1 # ==== TRAIN LOOP
> ----> 2 tr_it = iter(train_dataloader)
> 3 progress_bar = tqdm(range(cfg["train_params"]["max_num_steps"]))
> 4 losses_train = []
> 5 for _ in progress_bar:
>
> c:\python37\lib\site-packages\torch\utils\data\dataloader.py in __iter__(self)
> 277 return _SingleProcessDataLoaderIter(self)
> 278 else:
> --> 279 return _MultiProcessingDataLoaderIter(self)
> 280
> 281 @property
>
> c:\python37\lib\site-packages\torch\utils\data\dataloader.py in __init__(self, loader)
> 717 # before it starts, and __del__ tries to join but will get:
> 718 # AssertionError: can only join a started process.
> --> 719 w.start()
> 720 self._index_queues.append(index_queue)
> 721 self._workers.append(w)
>
> c:\python37\lib\multiprocessing\process.py in start(self)
> 110 'daemonic processes are not allowed to have children'
> 111 _cleanup()
> --> 112 self._popen = self._Popen(self)
> 113 self._sentinel = self._popen.sentinel
> 114 # Avoid a refcycle if the target function holds an indirect
>
> c:\python37\lib\multiprocessing\context.py in _Popen(process_obj)
> 221 @staticmethod
> 222 def _Popen(process_obj):
> --> 223 return _default_context.get_context().Process._Popen(process_obj)
> 224
> 225 class DefaultContext(BaseContext):
>
> c:\python37\lib\multiprocessing\context.py in _Popen(process_obj)
> 320 def _Popen(process_obj):
> 321 from .popen_spawn_win32 import Popen
> --> 322 return Popen(process_obj)
> 323
> 324 class SpawnContext(BaseContext):
>
> c:\python37\lib\multiprocessing\popen_spawn_win32.py in __init__(self, process_obj)
> 87 try:
> 88 reduction.dump(prep_data, to_child)
> ---> 89 reduction.dump(process_obj, to_child)
> 90 finally:
> 91 set_spawning_popen(None)
>
> c:\python37\lib\multiprocessing\reduction.py in dump(obj, file, protocol)
> 58 def dump(obj, file, protocol=None):
> 59 '''Replacement for pickle.dump() using ForkingPickler.'''
> ---> 60 ForkingPickler(file, protocol).dump(obj)
> 61
> 62 #
>
> BrokenPipeError: [Errno 32] Broken pipe
@lucabergamini I fixed this by following the programming guidelines in the Python documentation. Ref: https://docs.python.org/2/library/multiprocessing.html#multiprocessing-programming
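For reference, the key guideline there is to protect the entry point: under the `spawn` start method (the only one on Windows), worker processes re-import the main module, so any process-spawning code must sit behind an `if __name__ == "__main__":` guard. A minimal sketch:

```python
import multiprocessing as mp

def square(x):
    return x * x

# Under "spawn", each worker re-imports this module; the guard keeps the
# pool creation from running again inside the workers.
if __name__ == "__main__":
    with mp.Pool(processes=2) as pool:
        print(pool.map(square, [1, 2, 3]))  # [1, 4, 9]
```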
@deepakrajpurushothaman, this is a different issue: you need to set `num_workers=0`. Windows does not support multi-processing in the torch DataLoader here.
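To illustrate, here is a hedged sketch with a toy dataset standing in for `AgentDataset`: with `num_workers=0` the DataLoader produces batches in the main process, so no worker processes are spawned and no pickling across processes happens at all.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-in dataset; num_workers=0 keeps loading in the main process,
# avoiding the Windows worker-spawning/pickling path entirely.
dataset = TensorDataset(torch.arange(10).float())
train_dataloader = DataLoader(dataset, batch_size=4, shuffle=False, num_workers=0)

for (batch,) in train_dataloader:
    print(batch.shape)  # batches of up to 4 samples
```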
When installing the package I get the following output:
My torch versions are the following:
It seems like the torch version is indeed out of the supported range. Does that mean I have to downgrade torch?
Thanks!