Training log not printed out, model based RL noisy

ChanganVR / RelationalGraphLearning

[IROS20] Relational graph learning for crowd navigation


Training log not printed out, model based RL noisy #7

Open xzxzxzxz opened 3 years ago

xzxzxzxz commented 3 years ago

Hi Changan, thanks for the great work! I finished training, but I cannot plot the log figure with the command: python utils/plot.py data/output/output.log. I looked into the output.log file and found that it is broken, with some text missing. Do you have any idea how this could happen?

Also, I checked the VAL return by eye and noticed that the training curve can be very noisy. The average VAL return from 1k to 10k is something like: 0.4698, 0.5208, 0.3657, 0.2763, 0.3791, ..., 0.1896, 0.5917. The final performance is indeed as good as the result reported in your paper, but this may suggest that the model-based RL is not stable. Is this also what you got?
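One way to judge the trend rather than individual checkpoints is to smooth the returns before comparing them. The following is a minimal sketch, assuming a trailing moving average; the function name `moving_average` and the window size of 3 are illustrative choices, not something taken from the repository. Only the first five returns quoted above are used, since the rest are elided.

```python
def moving_average(values, window):
    """Return the trailing moving average of `values` for the given window size."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running_sum = 0.0
    for i, value in enumerate(values):
        running_sum += value
        if i >= window:
            # Drop the value that just fell out of the window.
            running_sum -= values[i - window]
        count = min(i + 1, window)  # fewer samples are available at the start
        smoothed.append(running_sum / count)
    return smoothed


# First five validation returns quoted in the post (later values are elided there).
val_returns = [0.4698, 0.5208, 0.3657, 0.2763, 0.3791]
smoothed = moving_average(val_returns, window=3)
print([round(x, 4) for x in smoothed])
```

A larger window gives a smoother curve but hides short-lived drops, so the window is worth varying when judging stability.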

Also, I trained the model on a 2080 Ti GPU for 27 hours. Is it supposed to be this slow? The networks look pretty small... Thank you so much for your attention!

ChanganVR commented 3 years ago

Hi @xzxzxzxz, I'm not exactly sure why it's broken. If you can paste the log here, maybe I can help you diagnose it. The code in plot.py is mostly a few lines of regular expressions. If the output format has somehow changed, you should change the expressions accordingly. Some online interactive tools can help you debug them, e.g., https://regexr.com/.
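A quick way to locate the broken part of a log is to run a pattern over every line and list the lines it fails to parse. The sketch below assumes a hypothetical log format and a hypothetical pattern `VAL_PATTERN`; the real expressions are in utils/plot.py and the real format is whatever output.log contains, so both would need to be adapted.

```python
import re

# Hypothetical pattern for illustration only; it is not copied from utils/plot.py.
VAL_PATTERN = re.compile(
    r"VAL\s+in episode (?P<episode>\d+).*?return: (?P<ret>-?\d+\.\d+)"
)


def split_matches(lines, pattern):
    """Separate lines into those the pattern parses and those it does not."""
    matched, unmatched = [], []
    for number, line in enumerate(lines, start=1):
        m = pattern.search(line)
        if m:
            matched.append((number, m.groupdict()))
        else:
            unmatched.append((number, line.rstrip("\n")))
    return matched, unmatched


# Hypothetical sample lines; the second one is truncated, as in a broken log.
sample_lines = [
    "VAL   in episode 1000 has success rate: 0.90, return: 0.4698\n",
    "VAL   in episode 2000 has succ\n",
]
matched, unmatched = split_matches(sample_lines, VAL_PATTERN)
for number, line in unmatched:
    print(f"line {number} did not match: {line!r}")
```

To run this on a real file, the lines can be read with `open(path).readlines()` and passed in place of `sample_lines`.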

I remember the validation performance oscillated during training, but I don't remember whether it was this much since it's been a while.

I used to train the policy on CPU instead of GPU, since most of the time was spent on the simulation and look-ahead calculation. I remember that the GPU didn't bring much speed gain. As for the training time, it was about half a day to one day. There are some optimizations you could make to speed up the computation; however, at this point I'm not able to spend more time on this project. If you could spend some time optimizing the code, I believe you would also gain a deeper understanding of it.
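To check where the time actually goes on a given machine, the standard-library profiler can be wrapped around one training iteration. This is only a sketch: `simulate_step`, `network_forward`, and `training_iteration` are hypothetical stand-ins, not functions from this repository, and would be replaced by the real training call.

```python
import cProfile
import io
import pstats


def simulate_step(n):
    # Hypothetical stand-in for the environment simulation and look-ahead.
    return sum(i * i for i in range(n))


def network_forward(n):
    # Hypothetical stand-in for a small network forward pass.
    return sum(range(n))


def training_iteration():
    simulate_step(200_000)
    network_forward(1_000)


profiler = cProfile.Profile()
profiler.enable()
training_iteration()
profiler.disable()

# Collect the five entries with the largest cumulative time into a string.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

If the simulation dominates the cumulative time in such a report, moving the networks to a GPU is unlikely to shorten training much.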

xzxzxzxz commented 3 years ago

Thank you Changan for your reply! It seems that the log file is too large and got corrupted, but it does not matter that much. The output performance is good and stable. I am interested in working on the multi-agent planning code base you created, and I hope we can keep in touch and discuss more. I wanted to note a minor problem with the human motion prediction network: in the paper the network structure is (64, 5), but in the implementation it is (64, 5) (self.model_predictive_rl.motion_predictor_dims = [64, 5]).

ChanganVR commented 3 years ago

Hi @xzxzxzxz sorry, I'm not getting your question. Do you mean the structure in the paper does not match the one in the code?

xzxzxzxz commented 3 years ago

Yes, the network structure seems to be slightly different, but this should be fine.