kncrane / CubeTrack

Unity ML-Agents Environment for Active Object Tracking with Reinforcement Learning
MIT License

Why use discrete instead of continuous action space? #2

Closed chenzhutian closed 2 years ago

chenzhutian commented 2 years ago

Nice work! I read your article and wonder why you decided to use a discrete action space. Have you tried a continuous action space? Thanks!

kncrane commented 2 years ago

Hello, thank you =)

No, there wasn't a particular argument for using discrete over continuous. I am now looking at the same task of active tracking, but in a slightly more realistic/complex environment. In that scenario I am using a continuous action space (two actions, one throttle and one steering, both with the range [-1, 1]) and Soft Actor-Critic, in preparation for potentially transferring to a real vehicle.
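For reference, a minimal sketch of how a two-action continuous space like that might map onto a vehicle. The function name, scaling factors, and clipping behaviour are my own assumptions for illustration, not taken from the CubeTrack code:

```python
import numpy as np

def apply_continuous_action(action, max_speed=5.0, max_steer_deg=30.0):
    """Map a two-element continuous action in [-1, 1] to throttle and steering.

    action[0] -> throttle, scaled to [-max_speed, max_speed]
    action[1] -> steering, scaled to [-max_steer_deg, max_steer_deg]

    Values are clipped first, since a stochastic policy (e.g. SAC before
    squashing) can emit slightly out-of-range samples.
    """
    a = np.clip(np.asarray(action, dtype=np.float32), -1.0, 1.0)
    throttle = float(a[0]) * max_speed
    steering = float(a[1]) * max_steer_deg
    return throttle, steering
```

Keeping both actions in [-1, 1] and scaling on the environment side is a common convention because it lets the policy network use the same tanh squashing for every action dimension.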

Just debugging at the moment. Although the agent has demonstrated some really nice tracking behaviour, it is completely unstable, with huge fluctuations in cumulative episodic reward throughout the training run. I'd been playing with hyperparameters and other design choices for months, then realised it's more likely a bug: I trialled a "cheat agent" (feeding it the heading vector as a vector observation rather than an image observation) and got the same thing.
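For anyone wanting to try the same debugging trick, here is a minimal sketch of what the "cheat" observation could look like. The function name and frame conventions are assumptions for illustration; the idea is just that substituting a ground-truth heading vector for the camera image isolates whether the instability comes from the perception side or the control/reward side:

```python
import numpy as np

def heading_observation(agent_pos, target_pos):
    """'Cheat' observation: unit heading vector from the agent to the target.

    Feeding this directly to the policy bypasses the visual encoder entirely.
    If training is still unstable with this observation, the problem is
    unlikely to be in the image-processing pipeline.
    """
    diff = np.asarray(target_pos, dtype=np.float64) - np.asarray(agent_pos, dtype=np.float64)
    norm = np.linalg.norm(diff)
    if norm == 0.0:
        # Agent and target coincide; no well-defined heading.
        return np.zeros_like(diff)
    return diff / norm
```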