Open brodermind opened 1 week ago
@eleurent
You are printing the same input action twice, the one which is unscaled, in [-1, 1]. The scaled action is fed to the vehicle directly. After it has been executed, you can access it with `print(self.controlled_vehicle.action)`.
Is the `action` argument of `def _reward(self, action: Action) -> float:` also unscaled? I want to compute rewards from the actual action, so I also added `print(self.controlled_vehicle.action)` inside `_reward`, but I get
AttributeError: 'HighwayEnv' object has no attribute 'controlled_vehicle'
How can I get the actual action inside `_reward`?
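For what it's worth, the log further down in this post shows that `self.vehicle.action` can be read inside `_reward` and holds a dict with `'acceleration'` and `'steering'` keys. Below is a minimal sketch of a reward that uses that dict instead of the raw `action` argument. `StubVehicle` and `StubEnv` are hypothetical stand-ins written only for this illustration, not library classes, and the reward formula itself is made up.

```python
from dataclasses import dataclass, field
from typing import Dict, Sequence


@dataclass
class StubVehicle:
    """Hypothetical stand-in for the ego vehicle (not the library class)."""
    speed: float = 0.0
    # Scaled action, shaped like the dict printed in the log of this post.
    action: Dict[str, float] = field(
        default_factory=lambda: {"acceleration": 0.0, "steering": 0.0}
    )


class StubEnv:
    """Hypothetical stand-in that only exposes a `vehicle` attribute."""

    def __init__(self, vehicle: StubVehicle) -> None:
        self.vehicle = vehicle

    def _reward(self, action: Sequence[float]) -> float:
        # `action` is the raw policy output in [-1, 1]; it is ignored here.
        # The scaled command is read from the vehicle instead.
        applied_acceleration = self.vehicle.action["acceleration"]
        # Illustrative reward only: penalise large applied accelerations.
        return -abs(applied_acceleration)


env = StubEnv(StubVehicle(speed=24.39, action={"acceleration": -9.14, "steering": -0.017}))
reward = env._reward([-0.90, 0.0])
```

The stub env has no `controlled_vehicle` attribute either, so accessing it would raise the same kind of `AttributeError`; the point is only that the scaled values are reached through `self.vehicle`.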
Besides, I added `print("ego acc: {}, speed: {}, acc: {}".format(self.vehicle.action, self.vehicle.speed, action[0]))` to `_reward` and got:
ego acc: {'acceleration': -9.143099665641785, 'steering': -0.0170410996816317}, speed: 24.390460022290547, acc: -0.9047888517379761
ego acc: {'acceleration': -8.94046038389206, 'steering': -0.006646797551511874}, speed: 23.794429330031075, acc: -0.8822733759880066
ego acc: {'acceleration': -8.713669419288635, 'steering': 0.00408259474217243}, speed: 23.213518035411834, acc: -0.8570743799209595
ego acc: {'acceleration': -8.462130784988403, 'steering': 0.011589494497536434}, speed: 22.649375983079274, acc: -0.8291256427764893
ego acc: {'acceleration': -8.193395435810089, 'steering': 0.01602412584630364}, speed: 22.103149620691934, acc: -0.7992661595344543
ego acc: {'acceleration': -7.89635956287384, 'steering': 0.018041595207714756}, speed: 21.576725649833676, acc: -0.7662621736526489
ego acc: {'acceleration': -7.466260373592377, 'steering': 0.01743891977243528}, speed: 21.07897495826085, acc: -0.7184733748435974
ego acc: {'acceleration': -7.3168264627456665, 'steering': 0.017414108681810925}, speed: 20.59118652741114, acc: -0.7018696069717407
ego acc: {'acceleration': -7.18576192855835, 'steering': 0.020717354298106838}, speed: 20.112135732173915, acc: -0.6873068809509277
So `action` is unscaled and `self.vehicle.action` is the scaled action. However, `self.vehicle.speed` does not seem to use the scaled value: it looks like it is computed as `self.vehicle.speed + action[0]`, not `self.vehicle.speed + self.vehicle.action['acceleration']`. Is that right?
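One way to check which value drives the speed is to test both hypotheses against the printed numbers. The sketch below uses the first four log lines. It assumes the scaling is a linear map from [-1, 1] onto [-10, 8] and that consecutive lines are 1/15 s apart (a 15 Hz update). Both values are inferred from the numbers themselves, not stated anywhere in this thread, and `lmap` here is a local helper written for this check.

```python
import math

# (unscaled action[0], scaled acceleration, speed) from the first four log lines.
LOG = [
    (-0.9047888517379761, -9.143099665641785, 24.390460022290547),
    (-0.8822733759880066, -8.94046038389206, 23.794429330031075),
    (-0.8570743799209595, -8.713669419288635, 23.213518035411834),
    (-0.8291256427764893, -8.462130784988403, 22.649375983079274),
]

# Assumptions inferred from the log, not confirmed by the thread.
ACCEL_RANGE = (-10.0, 8.0)  # assumed target range of the scaling
DT = 1.0 / 15.0             # assumed time between two printed lines


def lmap(value, source, target):
    """Linearly map `value` from the `source` interval to the `target` interval."""
    return target[0] + (value - source[0]) * (target[1] - target[0]) / (source[1] - source[0])


def scaling_matches(log=LOG):
    """True if every scaled acceleration equals lmap(raw, [-1, 1], ACCEL_RANGE)."""
    return all(
        math.isclose(lmap(raw, (-1.0, 1.0), ACCEL_RANGE), scaled, abs_tol=1e-6)
        for raw, scaled, _ in log
    )


def max_speed_error(delta, log=LOG):
    """Largest gap between logged speed and previous speed + delta(raw, scaled)."""
    return max(
        abs(prev[2] + delta(cur[0], cur[1]) - cur[2])
        for prev, cur in zip(log, log[1:])
    )


# Hypothesis A: speed += scaled acceleration * DT
error_scaled = max_speed_error(lambda raw, scaled: scaled * DT)
# Hypothesis B: speed += raw action[0]
error_raw = max_speed_error(lambda raw, scaled: raw)
```

Under these assumptions hypothesis A reproduces the logged speeds to within floating-point error, while hypothesis B is off by a few tenths per step. That suggests the speed is integrated from the scaled acceleration over a short time step, and the per-step change only looks close to `action[0]` because the step is a fraction of a second.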
@eleurent
Hi, everyone! I set env config as follows and trained a model:
However, when I reload the model and print the acceleration and steering values, none of them are in the expected range, and I am confused. I added `print("acc: {}, steering: {}".format(action[0], action[1]))` to `_reward()` in highway_env.py, and added `print` calls in the following function:
I get the following result:
original acc: 0.1795186996459961, steering: -0.02994769811630249
actual acc: 0.1795186996459961, steering: -0.02994769811630249
acc: 0.1795186996459961, steering: -0.02994769811630249
original acc: 0.20840740203857422, steering: -0.02228790521621704
actual acc: 0.20840740203857422, steering: -0.02228790521621704
acc: 0.20840740203857422, steering: -0.02228790521621704
original acc: 0.2234337329864502, steering: -0.014497756958007812
actual acc: 0.2234337329864502, steering: -0.014497756958007812
acc: 0.2234337329864502, steering: -0.014497756958007812
original acc: 0.2373422384262085, steering: -0.007743716239929199
actual acc: 0.2373422384262085, steering: -0.007743716239929199
acc: 0.2373422384262085, steering: -0.007743716239929199
original acc: 0.25095176696777344, steering: -0.0019524693489074707
actual acc: 0.25095176696777344, steering: -0.0019524693489074707
acc: 0.25095176696777344, steering: -0.0019524693489074707
original acc: 0.2658735513687134, steering: 0.0023392438888549805
actual acc: 0.2658735513687134, steering: 0.0023392438888549805
acc: 0.2658735513687134, steering: 0.0023392438888549805
original acc: 0.2798728942871094, steering: 0.004693746566772461
actual acc: 0.2798728942871094, steering: 0.004693746566772461
acc: 0.2798728942871094, steering: 0.004693746566772461
original acc: 0.290974497795105, steering: 0.005937099456787109
actual acc: 0.290974497795105, steering: 0.005937099456787109
acc: 0.290974497795105, steering: 0.005937099456787109
original acc: 0.3005625009536743, steering: 0.0067664384841918945
actual acc: 0.3005625009536743, steering: 0.0067664384841918945
acc: 0.3005625009536743, steering: 0.0067664384841918945
I AM SO CONFUSED .......