Closed by tsung-jui-wu 2 years ago
Thank you so much for your attention! The "xxx_parallel_act" function takes the sequence $[a_0, a_1, ..., a_{n-1}]$ as input and predicts $[a_1, a_2, ..., a_n]$ simultaneously in a single forward pass. It is used during training, a.k.a. teacher forcing. The "xxx_autoregressive_act" function is used for the sampling process that interacts with the environment, where no actions are available as inputs. It therefore infers the first action $a_1$ from a start signal $a_0$, then feeds $a_1$ back into the input and infers $a_2$ from $[a_0, a_1]$, and so on until $a_n$ is inferred, a.k.a. auto-regressive decoding.
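In case it helps, here is a minimal sketch of the two modes in plain Python. It is not the repository's code: `toy_decoder_step`, `parallel_act`, `autoregressive_act`, `NUM_ACTIONS`, and `START` are hypothetical names, and the "decoder" is a deterministic toy function rather than a trained network that samples from a distribution. The loop in `parallel_act` only emulates the causal mask; a real model would compute all positions in one forward pass.

```python
from typing import List

NUM_ACTIONS = 5  # hypothetical size of the discrete action space


def toy_decoder_step(prefix: List[int]) -> int:
    """Hypothetical stand-in for a causal decoder: prefix -> next action."""
    return (sum(prefix) + len(prefix)) % NUM_ACTIONS


def parallel_act(inputs: List[int]) -> List[int]:
    """Teacher forcing: inputs [a_0..a_{n-1}] are known, predict [a_1..a_n].

    Position t only sees inputs[:t + 1], which mimics a causal mask.
    """
    return [toy_decoder_step(inputs[: t + 1]) for t in range(len(inputs))]


def autoregressive_act(start_token: int, n: int) -> List[int]:
    """Sampling: no actions are given, so each prediction is fed back in."""
    seq = [start_token]
    for _ in range(n):
        seq.append(toy_decoder_step(seq))  # infer a_{t+1} from [a_0..a_t]
    return seq[1:]  # drop the start signal a_0


START = 0  # hypothetical start signal a_0
sampled = autoregressive_act(START, n=4)

# Replaying the sampled actions as teacher-forced inputs gives the same
# outputs, because position t sees the same prefix in both modes.
teacher_forced = parallel_act([START] + sampled[:-1])
print(sampled, teacher_forced)
```

With a deterministic decoder the two outputs match, which shows that both modes compute the same per-position prediction and differ only in where the input actions come from: stored data during training, or the model's own previous outputs during rollout.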
Thanks for the explanation!
Hi, thanks for the great work on this project.
This may seem like a simple question, but I can't wrap my head around it. Could you explain the difference between discrete_parallel_act and discrete_autoregressive_act?