umautobots / bidirection-trajectory-predicter

The code for Bi-directional Trajectory Prediction (BiTraP).

Multiple issues & their resolutions #2

Closed ksachdeva closed 3 years ago

ksachdeva commented 3 years ago

Hi @MoonBlvd

Unfortunately, several problems currently prevent your experiment from running. Below is a list of the issues and how I resolved them locally.

Issue 1

Your codebase relies on the trajectron-plus-plus repo and has a hardcoded path to it here: https://github.com/umautobots/bidireaction-trajectory-prediction/blob/d85c2348e2fdc976a2af50fdef7aeb06e1383bae/datasets/ETH_UCY.py#L9

The README should mention that trajectron-plus-plus must be cloned. A better approach would be to add it as a git submodule so that it is fetched together with your repo.
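
As an illustration of avoiding the hardcoded path, here is a minimal sketch that looks the Trajectron++ location up at runtime. The environment variable name TRAJECTRON_ROOT and the default submodule folder Trajectron-plus-plus/trajectron are assumptions made for this example; they are not taken from the BiTraP repo.

```python
import os
import sys
from pathlib import Path


def trajectron_path(repo_root, env_var="TRAJECTRON_ROOT"):
    """Return the directory to add to sys.path for Trajectron++ imports.

    Both the env var name and the default folder layout are hypothetical.
    """
    override = os.environ.get(env_var)
    if override:
        # An explicit setting wins over the default submodule location.
        return Path(override).expanduser()
    return Path(repo_root) / "Trajectron-plus-plus" / "trajectron"


def add_to_sys_path(path):
    """Append path to sys.path once, so repeated imports stay idempotent."""
    path = str(path)
    if path not in sys.path:
        sys.path.append(path)
```

With something like this, a dataset module could call add_to_sys_path(trajectron_path(repo_root)) instead of embedding a machine-specific path.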

Issue 2

In train.py you have imports like datasets.build, libs.build. Here - https://github.com/umautobots/bidireaction-trajectory-prediction/blob/d85c2348e2fdc976a2af50fdef7aeb06e1383bae/tools/train.py#L16

You could simply add the root dir of this repo to PYTHONPATH and would not need '.build' etc. Also, libs should be the bitrap module anyway.
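
If setting PYTHONPATH by hand is inconvenient, the same effect can be had from inside the script. The sketch below assumes the script lives one directory below the repo root (as tools/train.py does); the helper name ensure_on_path is made up for this example.

```python
import sys
from pathlib import Path


def ensure_on_path(script_file, levels_up=1):
    """Prepend an ancestor directory of script_file to sys.path.

    levels_up=1 means "the parent of the script's own directory",
    which for tools/train.py is the repo root.
    """
    root = Path(script_file).resolve().parents[levels_up]
    root_str = str(root)
    if root_str not in sys.path:
        sys.path.insert(0, root_str)
    return root
```

Calling ensure_on_path(__file__) at the top of a script in tools/ would then let top-level packages such as datasets and bitrap be imported regardless of the working directory.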

Issue 3

The ETH configuration does not define cfg.SOLVER.TRAIN_MODULE. Even if it did, build_optimizer does not accept a third parameter, so the call would have failed anyway.

https://github.com/umautobots/bidireaction-trajectory-prediction/blob/d85c2348e2fdc976a2af50fdef7aeb06e1383bae/tools/train.py#L59

I simply removed the third argument from this function call.

Issue 4

Running on CPU. It seems that you have done a good job at configuring the device usage. Most PyTorch developers do not take care of this and always assume GPU.

You should add to the README that anyone who wants to run on CPU can change the device in the config (yaml) file.

Issue 5

Multiprocessing, DataLoader & Pickle crashed on my computer with a strange error. I googled it and saw that many people have faced it. Here is the error -

File "/Users/ksachdeva/Desktop/Dev/exp/bidireaction-trajectory-prediction/env/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 294, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "/Users/ksachdeva/Desktop/Dev/exp/bidireaction-trajectory-prediction/env/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 801, in __init__
    w.start()
  File "/Users/ksachdeva/.pyenv/versions/3.8.6/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/Users/ksachdeva/.pyenv/versions/3.8.6/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/Users/ksachdeva/.pyenv/versions/3.8.6/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/Users/ksachdeva/.pyenv/versions/3.8.6/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/Users/ksachdeva/.pyenv/versions/3.8.6/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/Users/ksachdeva/.pyenv/versions/3.8.6/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/Users/ksachdeva/.pyenv/versions/3.8.6/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function augment at 0x157e8d550>: attribute lookup augment on __main__ failed

To resolve this I changed the NUM_WORKERS configuration to 0 (which is the default in torch).

I am not sure whether this bug is triggered by how you wrote the loop, by torch, or by Python on macOS. FYI - many people are facing this in other contexts as well.
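
A possible explanation, offered as a guess rather than a diagnosis: the traceback goes through popen_spawn_posix, i.e. the "spawn" start method, which became the default on macOS in Python 3.8. With spawn, everything handed to a worker process is pickled, and pickle stores functions by module and name only. The sketch below reproduces that failure mode with the standard library alone; the helper names are made up, and the function is only a stand-in for whatever object in the dataset carries an augment function that cannot be looked up.

```python
import pickle


def make_unreachable_function():
    """Build a function that claims to live at __main__.augment but does not."""
    def augment(scene):
        return scene

    # Hypothetical stand-in for a function restored from a data file: its
    # recorded module and name do not point at any importable attribute.
    augment.__module__ = "__main__"
    augment.__qualname__ = "augment"
    return augment


def is_picklable(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False
```

Here is_picklable(len) is true, while is_picklable(make_unreachable_function()) is false. With NUM_WORKERS set to 0 no worker process is started, so nothing has to be pickled, which would be consistent with the workaround above. Start methods that fork instead of spawn do not pickle the dataset either, which may be why the error does not appear on every machine.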

Issue 6

https://github.com/umautobots/bidireaction-trajectory-prediction/blob/d85c2348e2fdc976a2af50fdef7aeb06e1383bae/tools/train.py#L129

EarlyStopping is not imported, so Python raised an error. In any case, it is not used, so I commented out this line.

Issue 7

This is the issue where I am stuck. I believe that you may have modified trajectron-plus-plus and that those changes are not in their repo.

Here is the error -

File "/Users/ksachdeva/Desktop/Dev/exp/bidireaction-trajectory-prediction/env/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/Users/ksachdeva/Desktop/Dev/exp/bidireaction-trajectory-prediction/env/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/Users/ksachdeva/Desktop/Dev/exp/bidireaction-trajectory-prediction/datasets/ETH_UCY.py", line 238, in __getitem__
    first_history_index, x_t, y_t, x_st_t, y_st_t, neighbors_data, neighbors_data_st, neighbors_lower_upper, neighbors_future, \
ValueError: not enough values to unpack (expected 14, got 9)

As you can see, your code expects 14 values to be returned, but NodeTypeDataset returns 9.

I then checked trajectron-plus-plus and it does indeed return 9 values. Here is the link to the relevant place in trajectron-plus-plus:

https://github.com/StanfordASL/Trajectron-plus-plus/blob/f8a11fd90a1a4cb64c70ece0d4cf4a92dfc4e7f7/trajectron/model/dataset/preprocessing.py#L190

Maybe you have locally modified the trajectron-plus-plus repo.

Please advise.

Regards Kapil

MoonBlvd commented 3 years ago

Hi @ksachdeva, we appreciate your feedback. The code runs just fine on our machines, so I think you are right that there are some dependencies and hardcoded paths that we need to fix. We also modified the Trajectron++ code, so pulling their code as-is wouldn't work immediately. We have been very busy recently, but we will work on it ASAP.

ksachdeva commented 3 years ago

Thanks @MoonBlvd,

I resolved the 14 vs 9 issue (Issue number 7). It seems that those additional parameters are not used by the rest of your code.
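
For anyone hitting the same unpacking error, one way to bridge the mismatch is to pad the shorter tuple before unpacking. This is only a sketch of the idea, not necessarily the change described above; pad_sample is a hypothetical helper, and it is only safe when the extra values really are unused downstream.

```python
def pad_sample(sample, expected=14, fill=None):
    """Pad a sample tuple with `fill` so it unpacks into `expected` names.

    Raises ValueError if the sample is already longer than expected,
    since silently dropping values would hide a real mismatch.
    """
    sample = tuple(sample)
    if len(sample) > expected:
        raise ValueError(
            f"sample has {len(sample)} values, expected at most {expected}"
        )
    return sample + (fill,) * (expected - len(sample))
```

A 9-value sample then unpacks into 14 names, with the last five set to None.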

After that there were a few more issues. I am listing them here so that you are aware of them and in case someone else tries the repo.

Issue 8

The usage of cave_loss function here https://github.com/umautobots/bidireaction-trajectory-prediction/blob/d85c2348e2fdc976a2af50fdef7aeb06e1383bae/bitrap/modeling/bitrap_np.py#L226

is not correct, as you are passing arguments that this function does not accept.

Issue 9

https://github.com/umautobots/bidireaction-trajectory-prediction/blob/d85c2348e2fdc976a2af50fdef7aeb06e1383bae/bitrap/modeling/bitrap_np.py#L233

mutual_info_p and self.cfg.PSEUDO_NLL are not defined.

Since the latent distribution for bitrap_np is Gaussian, I commented out these lines.

After resolving issue 9 the training started!

If you want, I can submit these changes as pull requests.

Regards Kapil

MoonBlvd commented 3 years ago

Hi @ksachdeva, self.cfg.PSEUDO_NLL is not used. We tried to clean up this repo before publishing it, but it seems that it still contains some code that we did not use for the final paper. Again, thank you for pointing out these issues; we will work on them ASAP.

MoonBlvd commented 3 years ago

Hi @ksachdeva, we have solved the issues you posted and updated this repo. We have skipped your issue 5, as we do not see it happen on any of our machines. Please feel free to close this issue, thank you!

sanketh1691 commented 2 years ago

Hi @ksachdeva, I am trying to extract the pedestrian trajectories from the original JAAD and PIE repos as pickle files, as the README of this repo asks, but I am unable to find functions that generate the pedestrian trajectory data. Can you help or guide me on how to extract this data?

Thank you, Sanketh