woven-planet / l5kit

L5Kit - https://woven.toyota
https://woven-planet.github.io/l5kit

Not able to install the package due to torch version #123

Closed — nosound2 closed this issue 4 years ago

nosound2 commented 4 years ago

When installing the package I get the following output:

(base) C:\StudioProjects> pip install l5kit
Collecting l5kit
  Using cached l5kit-1.0.6-py3-none-any.whl (81 kB)
Requirement already satisfied: setuptools in c:\users\nosou\anaconda3\lib\site-packages (from l5kit) (49.6.0.post20200814)
Collecting pymap3d
  Using cached pymap3d-2.4.1.tar.gz (30 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
Requirement already satisfied: tqdm in c:\users\nosou\anaconda3\lib\site-packages (from l5kit) (4.48.2)
Requirement already satisfied: imageio in c:\users\nosou\anaconda3\lib\site-packages (from l5kit) (2.9.0)
Requirement already satisfied: scipy in c:\users\nosou\anaconda3\lib\site-packages (from l5kit) (1.3.1)
Collecting ptable
  Downloading PTable-0.9.2.tar.gz (31 kB)
Requirement already satisfied: matplotlib in c:\users\nosou\anaconda3\lib\site-packages (from l5kit) (3.3.1)
ERROR: Could not find a version that satisfies the requirement torch<1.6.0,>=1.5.0 (from l5kit) (from versions: 0.1.2, 0.1.2.post1, 0.1.2.post2)
ERROR: No matching distribution found for torch<1.6.0,>=1.5.0 (from l5kit)

My torch versions are the following:

(base) C:\StudioProjects> pip freeze |grep torch
efficientnet-pytorch==0.6.3
facenet-pytorch==1.0.1
torch==1.6.0
torchvision==0.7.0

It seems my installed torch version is indeed outside the required range. Does this mean I have to downgrade torch?

Thanks!
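As a quick sanity check (this is my own helper, not part of l5kit), you can test whether an installed torch version falls inside the range l5kit 1.0.6 pins, using the `packaging` library:

```python
# Check an installed version string against l5kit's pin torch>=1.5.0,<1.6.0.
# The spec string below is copied from the pip error above.
from packaging.specifiers import SpecifierSet
from packaging.version import Version


def satisfies_l5kit_pin(installed: str, spec: str = ">=1.5.0,<1.6.0") -> bool:
    """Return True if `installed` is an acceptable torch version for l5kit."""
    return Version(installed) in SpecifierSet(spec)
```

With torch 1.6.0 installed, this returns False, which matches pip refusing to resolve the dependency.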

winklemad commented 4 years ago

Same issue here. Has anyone found a solution?

lucabergamini commented 4 years ago

Hi @nosound2 and @winklemad, a bit of context about why we left PyTorch 1.6 out of the requirements: with the competition approaching quickly, we didn't want to include a potentially breaking change in the dependencies, and we didn't have enough time to test it thoroughly.

That being said, this is probably a Windows-related issue (there is a section about it in the README). Your best chance is probably to:

You can also achieve the same by cloning the repository and installing it in editable mode, if you want a local copy that picks up your changes.

nosound2 commented 4 years ago

Thanks @lucabergamini, removing the dependencies worked for me. I went the cloning route, and yes, I am on Windows.

I needed to do one more thing: `from l5kit.dataset import EgoDataset, AgentDataset` failed for me on the line `multiprocessing.set_start_method("fork", force=True)  # this fix loop in python 3.8 on MacOS`, apparently because there is no fork on Windows.

I removed that line from the source file, and at least some functionality seems to run now.

lucabergamini commented 4 years ago

> `multiprocessing.set_start_method("fork", force=True)  # this fix loop in python 3.8 on MacOS`

Thanks for letting me know, I'll put a platform check there!
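A minimal sketch of such a platform check (the function name is mine, not the actual l5kit patch): the `fork` start method exists only on POSIX systems, so it should be applied conditionally.

```python
# Only force the "fork" start method where the OS supports it;
# Windows supports only "spawn", so forcing "fork" there raises an error.
import multiprocessing
import platform


def maybe_force_fork(system: str = platform.system()) -> bool:
    """Return True if the 'fork' start method was applied, False if skipped."""
    if system == "Windows":
        return False
    # Works around a DataLoader hang observed with Python 3.8 on macOS.
    multiprocessing.set_start_method("fork", force=True)
    return True
```

Calling `maybe_force_fork()` at import time keeps the macOS workaround while leaving Windows on its default `spawn` method.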

deepakrajpurushothaman commented 4 years ago

@lucabergamini I still get the fork/broken-pipe error. Could you have a look? I am running Windows 10, Torch 1.5.1, Python 3.7.9, l5kit 1.6, and I have also tried l5kit master and l5kit 1.5 after the fix. I think running Linux is the only solution for now.

> ---------------------------------------------------------------------------
> BrokenPipeError                           Traceback (most recent call last)
> <ipython-input-40-b7d59c605d01> in <module>
>       1 # ==== TRAIN LOOP
> ----> 2 tr_it = iter(train_dataloader)
>       3 progress_bar = tqdm(range(cfg["train_params"]["max_num_steps"]))
>       4 losses_train = []
>       5 for _ in progress_bar:
> 
> c:\python37\lib\site-packages\torch\utils\data\dataloader.py in __iter__(self)
>     277             return _SingleProcessDataLoaderIter(self)
>     278         else:
> --> 279             return _MultiProcessingDataLoaderIter(self)
>     280 
>     281     @property
> 
> c:\python37\lib\site-packages\torch\utils\data\dataloader.py in __init__(self, loader)
>     717             #     before it starts, and __del__ tries to join but will get:
>     718             #     AssertionError: can only join a started process.
> --> 719             w.start()
>     720             self._index_queues.append(index_queue)
>     721             self._workers.append(w)
> 
> c:\python37\lib\multiprocessing\process.py in start(self)
>     110                'daemonic processes are not allowed to have children'
>     111         _cleanup()
> --> 112         self._popen = self._Popen(self)
>     113         self._sentinel = self._popen.sentinel
>     114         # Avoid a refcycle if the target function holds an indirect
> 
> c:\python37\lib\multiprocessing\context.py in _Popen(process_obj)
>     221     @staticmethod
>     222     def _Popen(process_obj):
> --> 223         return _default_context.get_context().Process._Popen(process_obj)
>     224 
>     225 class DefaultContext(BaseContext):
> 
> c:\python37\lib\multiprocessing\context.py in _Popen(process_obj)
>     320         def _Popen(process_obj):
>     321             from .popen_spawn_win32 import Popen
> --> 322             return Popen(process_obj)
>     323 
>     324     class SpawnContext(BaseContext):
> 
> c:\python37\lib\multiprocessing\popen_spawn_win32.py in __init__(self, process_obj)
>      87             try:
>      88                 reduction.dump(prep_data, to_child)
> ---> 89                 reduction.dump(process_obj, to_child)
>      90             finally:
>      91                 set_spawning_popen(None)
> 
> c:\python37\lib\multiprocessing\reduction.py in dump(obj, file, protocol)
>      58 def dump(obj, file, protocol=None):
>      59     '''Replacement for pickle.dump() using ForkingPickler.'''
> ---> 60     ForkingPickler(file, protocol).dump(obj)
>      61 
>      62 #
> 
> BrokenPipeError: [Errno 32] Broken pipe

@lucabergamini I fixed this by following the programming guidelines in the Python documentation. Ref: https://docs.python.org/2/library/multiprocessing.html#multiprocessing-programming

nosound2 commented 4 years ago

@deepakrajpurushothaman, this is a different issue: you need to set `num_workers=0`. The torch DataLoader's multi-process workers do not work reliably on Windows.
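One way to handle this portably (a hypothetical helper, not l5kit or torch API) is to compute the DataLoader keyword arguments based on the platform, falling back to single-process loading on Windows:

```python
# On Windows, DataLoader workers are spawned and must pickle the dataset,
# which is what raises BrokenPipeError in the traceback above.
# num_workers=0 loads batches in the main process and sidesteps that.
import platform


def dataloader_kwargs(requested_workers: int,
                      system: str = platform.system()) -> dict:
    """Build DataLoader kwargs, disabling worker processes on Windows."""
    workers = 0 if system == "Windows" else requested_workers
    return {"num_workers": workers, "shuffle": True}
```

These kwargs can then be splatted into the DataLoader constructor, e.g. `DataLoader(dataset, **dataloader_kwargs(8))`.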