Open duxtinto opened 1 year ago
Hi @vojtamolda!
While debugging the code for https://github.com/vojtamolda/reinforcement-learning-an-introduction/blob/main/chapter04/exercise04-07.ipynb, I ran into this scenario.
In the transitions method of the JacksCarRental class:
(1) the value of transfer was -5, (2) the transfer_cost was 5*2=10, and (3) the transferred value was -3 (because self.max_cars - state[0] was 3).
It looks odd to me that transfer_cost is charged for 5 cars when in fact only 3 were moved.
Shouldn't transfer_cost be calculated only for the cars that were actually transferred?
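To make the numbers concrete, here is a minimal standalone sketch of the scenario. It is not the notebook's code: the function name transfer_costs, the sign convention (a negative transfer moves cars from the second location to the first), max_cars = 20, a cost of 2 per car, and the state (17, 10) are all assumptions chosen to reproduce the values above.

```python
def transfer_costs(transfer, state, max_cars=20, cost_per_car=2):
    """Compare the cost of the requested transfer with the cost of the
    cars actually moved. Hypothetical helper, not part of the notebook.

    Assumed convention: transfer < 0 moves cars from the second
    location to the first, transfer > 0 moves them the other way.
    """
    first, second = state
    if transfer < 0:
        # Limited by cars available at the second location and by
        # free space at the first location.
        moved = -min(-transfer, second, max_cars - first)
    else:
        # Limited by cars available at the first location and by
        # free space at the second location.
        moved = min(transfer, first, max_cars - second)

    cost_requested = abs(transfer) * cost_per_car  # what I observed
    cost_actual = abs(moved) * cost_per_car        # what I expected
    return moved, cost_requested, cost_actual


# Requested -5, but only 3 free spots at the first location (20 - 17).
print(transfer_costs(-5, (17, 10)))  # (-3, 10, 6)
```

Under these assumptions the requested transfer costs 10, while charging only for the 3 cars that were moved would cost 6.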
Thank you very much for your help.
Best regards,
David.
Thanks for opening the issue. I don't have time to work on this right now, but I'll look into it eventually.