Closed DaniilKardava closed 2 years ago
Yes, you can absolutely reward the agent "later"! One of the core problems RL tackles is credit assignment, where the algorithm tries to figure out which earlier actions were responsible for a reward that only arrives later :)
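To make that concrete, here is a minimal sketch in plain Python (not SB3's actual implementation; `discounted_returns` and the toy reward list are made up for illustration). It shows how a reward given only at the last step still reaches earlier steps through the discounted return, which is the quantity most RL algorithms learn from:

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ... for each step t."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step accumulates the (discounted) future rewards.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Hypothetical episode: the agent is only rewarded at the final step.
rewards = [0.0, 0.0, 0.0, 1.0]
returns = discounted_returns(rewards, gamma=0.5)
print(returns)  # [0.125, 0.25, 0.5, 1.0]
```

Every earlier step ends up with a non-zero return, so the actions taken there still get some credit for the delayed reward. How much credit depends on `gamma` and on how many steps separate the action from the reward.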
And yes, in future, please ask in the SB3 issues. Also, for these types of questions (not issues or enhancements), try asking on other forums such as the RL Discord.
Sorry if this is a silly question. When I call step and pass a certain action, the reward calculated during that step is associated with the current action and observation, right? So if I need to reward the agent for a successful decision, would I need to look forward and reward it immediately, rather than reward it later and have that reward associated with whatever action is taking place at that point? Thank you. Edit: Would it be better for me to post future questions under SB3? I am currently using an older version because I wasn't able to use LSTM.