Hello, thank you =)
No, there wasn't a particular argument for using discrete over continuous. I'm now looking at the same task of active tracking, but in a slightly more realistic/complex environment, and in that scenario I'm using a continuous action space (two actions, one for throttle and one for steering, both in the range [-1, 1]) with Soft Actor-Critic, in preparation for potentially transferring to a real vehicle.
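For concreteness, here's a minimal sketch of what that action layout looks like, assuming a Gymnasium-style environment and the Soft Actor-Critic implementation from Stable-Baselines3 (v2+, which targets Gymnasium). This is not my actual setup; the environment class, its dynamics, and its reward are placeholders purely to illustrate a 2-D continuous [-1, 1] action space trained with SAC.

```python
# Sketch only: placeholder env with [throttle, steering] actions in [-1, 1],
# trained with Soft Actor-Critic. Dynamics/reward are dummies, not the real task.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import SAC


class ToyTrackingEnv(gym.Env):
    """Placeholder tracking env whose observation is a 2-D heading vector."""

    def __init__(self):
        super().__init__()
        # Two continuous actions: [throttle, steering], each in [-1, 1]
        self.action_space = spaces.Box(low=-1.0, high=1.0, shape=(2,), dtype=np.float32)
        # Placeholder vector observation (e.g. heading to the target)
        self.observation_space = spaces.Box(low=-1.0, high=1.0, shape=(2,), dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self._heading = self.np_random.uniform(-1.0, 1.0, size=2).astype(np.float32)
        return self._heading, {}

    def step(self, action):
        throttle, steering = action  # throttle unused in these dummy dynamics
        self._heading = np.clip(self._heading + 0.01 * steering, -1.0, 1.0).astype(np.float32)
        reward = float(-np.linalg.norm(self._heading))  # dummy shaping term
        return self._heading, reward, False, False, {}


env = ToyTrackingEnv()
model = SAC("MlpPolicy", env, verbose=0)
model.learn(total_timesteps=1_000)
```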
Just debugging at the moment: although the agent has demonstrated some really nice tracking behaviour, it is completely unstable, with huge fluctuations in cumulative episodic reward throughout the training run. I'd been playing with hyperparameters and other design choices for months, then realised it's more likely a bug, because when I trialled a 'cheat agent' (feeding it the heading vector as a vector observation rather than an image observation, as sketched below) I got the same instability.
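The 'cheat agent' idea could be expressed as an observation wrapper like the one below. Again this is just a sketch, not my code: `heading_to_target()` is a hypothetical hook standing in for whatever ground-truth state the simulator exposes.

```python
# Sketch of the 'cheat agent' debugging trick: swap the image observation for
# a ground-truth heading vector so the vision pipeline is taken out of the loop.
import numpy as np
import gymnasium as gym
from gymnasium import spaces


class CheatObservation(gym.ObservationWrapper):
    def __init__(self, env):
        super().__init__(env)
        # Replace the image observation space with a 2-D heading vector
        self.observation_space = spaces.Box(low=-1.0, high=1.0, shape=(2,), dtype=np.float32)

    def observation(self, obs):
        # Discard the image and return the ground-truth heading instead.
        # `heading_to_target()` is assumed to exist on the underlying env.
        return np.asarray(self.env.unwrapped.heading_to_target(), dtype=np.float32)


# Usage (hypothetical): env = CheatObservation(make_real_env())
```

The point of the comparison is that if the vector-observation agent shows the same reward fluctuations as the image-based one, the instability is unlikely to be coming from the vision side.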
Nice work! I read your article and wondered why you decided to use a discrete action space. Have you tried a continuous action space? Thanks!