Open xzxzxzxz opened 3 years ago
Hi @xzxzxzxz, I'm not exactly sure why it's broken. If you can paste the log here, maybe I can help you diagnose. The code in plot.py is mostly a few lines of regular expressions. If the log format somehow changed, you should update the expressions accordingly. Some online interactive tools can help you debug, e.g., https://regexr.com/.
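For reference, here is a minimal sketch of the kind of regex extraction plot.py does. The log line format used below is an assumption for illustration, not the actual format of output.log; adjust the pattern to whatever your log lines look like:

```python
import re

# Hypothetical log lines -- the exact format of output.log may differ.
log_text = """\
2021-01-01 12:00:00, INFO: VAL in episode 1000 has average return: 0.4698
2021-01-01 12:30:00, INFO: VAL in episode 2000 has average return: 0.5208
"""

# Capture the episode number and the average return from each VAL line.
pattern = re.compile(
    r"VAL in episode (?P<episode>\d+) has average return: (?P<ret>[-+]?\d*\.\d+)"
)

episodes, returns = [], []
for match in pattern.finditer(log_text):
    episodes.append(int(match.group("episode")))
    returns.append(float(match.group("ret")))

print(episodes)  # episode numbers parsed from the log
print(returns)   # corresponding average returns
```

If plotting fails, printing the parsed lists like this quickly shows whether the pattern is matching anything at all.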
I remember the validation performance oscillated during training, but I don't remember whether it was this much since it's been a while.
I used to train the policy on CPU instead of GPU, since most of the time was spent on the simulation and lookahead calculation; I remember the GPU didn't bring much speed gain. As for the training time, it was about half a day to one day. There are some optimizations you could do to speed the computation up; however, at this point I'm not able to spend more time on this project. If you could spend some time optimizing the code, I believe you would also gain a deeper understanding of it.
Thank you, Changan, for your reply! It seems the log file is too large and got corrupted, but that does not matter much: the output performance is good and stable. I am interested in building on the codebase you created for multi-agent planning, and I hope we can keep in touch and discuss more. I also wanted to note a minor discrepancy in the human motion prediction network: the paper gives the network structure as (64, 5), but the implementation has (64, 5) (self.model_predictive_rl.motion_predictor_dims = [64, 5]).
Hi @xzxzxzxz sorry, I'm not getting your question. Do you mean the structure in the paper does not match the one in the code?
Yes, the network structure seems to be slightly different, but this should be fine.
Hi Changan, thanks for the great work! I finished training, but I cannot plot the figures from the log with the command
python utils/plot.py data/output/output.log
I looked into the output.log file and found it was corrupted; some text is missing. Do you have any idea how this could happen? Also, I inspected the VAL returns by eye and noticed the training curve can be very noisy. The average VAL return from 1k to 10k goes something like: 0.4698, 0.5208, 0.3657, 0.2763, 0.3791, ..., 0.1896, 0.5917. The final performance is indeed as good as the result reported in your paper, but this may suggest the model-based RL is not stable. Is this also what you got?
Also, I trained the model on a 2080 Ti GPU for 27 hours; is it supposed to be this slow? The networks look pretty small... Thank you so much for your attention!
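Tangentially, one quick way to see past the noise in a validation curve like the one quoted above is to smooth it with a trailing moving average. A minimal sketch, using only the values quoted in the thread (the gap at "..." is simply omitted, so this is illustrative, not the full curve):

```python
# Partial VAL returns quoted in the thread; the "..." gap is omitted.
val_returns = [0.4698, 0.5208, 0.3657, 0.2763, 0.3791, 0.1896, 0.5917]

def moving_average(values, window=3):
    """Trailing moving average: each point averages the last `window` values."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

smoothed = moving_average(val_returns)
print([round(v, 4) for v in smoothed])
```

A smoothed curve makes it easier to tell a genuinely unstable training run from one that merely has high per-checkpoint variance.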