Open wenjie-mo opened 1 year ago
Hi! Sorry you've been having trouble. Let me answer each point one by one. First off, that 2k number corresponds to environment stepping time alone (i.e. no RL algorithm in the loop), so during training you'll see an FPS that differs significantly from it depending on the type of policy used and on whether the framework reports a per-agent FPS or the total amount of experience generated per second. As for each particular issue:
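To see what the env-only number looks like in isolation, here is a minimal timing sketch. The `DummyEnv` class is a hypothetical stand-in (the real environment construction in Nocturne differs); only the measurement pattern is the point: step the env in a tight loop with no learner attached.

```python
import time

class DummyEnv:
    """Hypothetical stand-in for a Nocturne environment; replace with
    the real env construction from the repo."""
    def reset(self):
        return 0

    def step(self, action):
        # (observation, reward, done, info) in the classic gym convention
        return 0, 0.0, False, {}

def measure_env_fps(env, n_steps=10_000):
    """Measure raw stepping throughput with no RL algorithm in the loop."""
    env.reset()
    start = time.perf_counter()
    for _ in range(n_steps):
        _, _, done, _ = env.step(None)
        if done:
            env.reset()
    return n_steps / (time.perf_counter() - start)

fps = measure_env_fps(DummyEnv())
print(f"env-only FPS: {fps:.0f}")
```

A number from a loop like this is comparable to the paper's 2k figure; training FPS will differ for the reasons above.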
In the first case, we didn't freeze our sample_factory version, and the newest one has an additional hparam that our version didn't have. This is fixed in https://github.com/facebookresearch/nocturne/pull/59, which will be merged shortly. If you run that PR on your machine, you should see about 10k-20k FPS.
Looking into this one; that error usually means something went wrong when setting the config.
For this one, you need to increase the value of n_training_threads; the environment runs without any vectorization by default. Hope that helps!
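As a config fragment, an override along these lines should work; the exact flag path depends on the repo's config schema, so treat this as a sketch and check the config files for where n_training_threads actually lives.

```shell
# Hypothetical override -- verify the parameter path against the repo's config
python run_sample_factory.py algorithm=APPO n_training_threads=8
```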
Hi Eugene, thanks so much for the reply and clarification! I will try out these solutions soon and let you know if they all work!
Hi, sorry, I accidentally closed the issue. I would like to keep it open just for tracking purposes. Thanks!
Question
Hello, I am wondering which script and hyperparameters can achieve the 2000+ steps/sec training speed mentioned in the paper. I have tried the following:
run_sample_factory.py algorithm=APPO — Problem: when using the sample_factory library, the parameters lr_schedule and max_entropy_coeff are missing, and I am not sure what values I should use for them.
run_rllib.py — Problem: the same runtime error occurs for every worker, attached below:
My settings:
- Code: newest code from the main branch
- OS: Ubuntu 20.04
- GPU: RTX 3080 with CUDA 11.6
- sample_factory: I have tried both the latest version and aed6cc92a7eb3510c4d4bcfac083ced07b5222f9 (as mentioned in the paper)
Please let me know if I did anything wrong when running the scripts. Thanks so much for answering!