Hi Matt,
I loved using this env. It's very simple and smooth to use and understand. I was using it for practice while learning Monte Carlo control. The version I was following (exploring starts) requires randomly selecting the initial state in each episode and following the trajectory from there to the terminal state. I couldn't do this directly with your env because resetting it always sets the state to (0,0).
Maybe adding an optional argument to the reset method for a desired start state could save people some time in the future. Regardless, thank you for this gift!
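In case it helps, here is a rough sketch of what I have in mind. I haven't looked at how your env is implemented, so `GridEnv`, its `rows`/`cols`/`terminal` attributes, the `start_state` parameter, and the `random_start` helper are all made-up names for illustration, not your actual API. The only idea is that `reset()` keeps starting at (0,0) by default and accepts a different start when one is passed in.

```python
import random


class GridEnv:
    """Hypothetical stand-in for the env (assumed to be a small grid)."""

    def __init__(self, rows=4, cols=4):
        self.rows = rows
        self.cols = cols
        # Assumption: the terminal state is the bottom-right cell.
        self.terminal = (rows - 1, cols - 1)
        self.state = (0, 0)

    def reset(self, start_state=None):
        # With no argument, keep the existing behaviour: start at (0, 0).
        if start_state is None:
            start_state = (0, 0)
        row, col = start_state
        if not (0 <= row < self.rows and 0 <= col < self.cols):
            raise ValueError(f"start_state {start_state} is outside the grid")
        self.state = (row, col)
        return self.state


def random_start(env, rng=random):
    """Pick a uniformly random non-terminal state (for exploring starts)."""
    candidates = [
        (r, c)
        for r in range(env.rows)
        for c in range(env.cols)
        if (r, c) != env.terminal
    ]
    return rng.choice(candidates)
```

With something like this, each episode could begin with `env.reset(start_state=random_start(env))`, while existing code that calls `env.reset()` with no argument would behave exactly as it does now.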