hubbs5 / or-gym

Environments for OR and RL Research
MIT License
373 stars, 93 forks

Update vehicle_routing.py #10

Closed ashwin-M-D closed 2 years ago

ashwin-M-D commented 3 years ago

The vehicle routing environment had problems comparing tuples with arrays, which prevented it from working as expected. The movement directions were also wrong. Both issues have been fixed. The state output and the other environment variables are unchanged.
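
For illustration, here is a minimal sketch of the kind of comparison problem described above. It is not the code from this PR. It assumes grid positions may arrive as tuples, lists, or array-like objects, and `as_cell` and `same_cell` are hypothetical helper names. In Python, a tuple never compares equal to a list with the same elements, and comparing a tuple with a NumPy array yields an element-wise array rather than a single boolean, so normalizing both sides to tuples first avoids both pitfalls.

```python
def as_cell(position):
    """Convert a 2-D position (tuple, list, or array-like) to a tuple of ints."""
    return tuple(int(coord) for coord in position)


def same_cell(a, b):
    """Return a single bool telling whether two positions refer to the same cell."""
    return as_cell(a) == as_cell(b)


vehicle = (2, 3)      # position stored as a tuple
restaurant = [2, 3]   # same cell, stored as a list

naive_match = vehicle == restaurant          # tuple vs. list: never equal
safe_match = same_cell(vehicle, restaurant)  # compared after normalization
```

Here `naive_match` is `False` even though both values describe the same cell, while `safe_match` is `True`.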

mawbray commented 3 years ago

Hi @hubbs5 @ashwin-M-D, I am playing around with the vehicle routing problem (on the master branch). Looking at the rewards, every value is either 0 or -0.1 (generated from 100 episodes of a uniformly random policy). This suggests to me that there may still be a problem with the allocation/environment construction. Could you please look into this? If it is a problem, I am happy to investigate further and try to fix it; if not, could you please clarify why this reward distribution would be observed? (I have run the test multiple times.) Thanks in advance for your help :)
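
A check like the one described above could be sketched as follows. This is an assumption-laden illustration, not the script used in this thread: `ToyEnv` is a hypothetical stand-in with a classic Gym-style `reset`/`step` interface and made-up reward rules, and `reward_histogram` is a hypothetical helper that tallies the per-step rewards seen under a uniformly random policy.

```python
import random
from collections import Counter


class ToyEnv:
    """Hypothetical stand-in with a classic Gym-style reset/step interface."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        # Toy rule (made up): action 0 is free, every other action costs 0.1.
        reward = 0.0 if action == 0 else -0.1
        done = self.t >= self.horizon
        return self.t, reward, done, {}


def reward_histogram(env, n_actions, n_episodes=100, seed=0):
    """Count each per-step reward seen under a uniformly random policy."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_episodes):
        env.reset()
        done = False
        while not done:
            action = rng.randrange(n_actions)  # uniform over discrete actions
            _, reward, done, _ = env.step(action)
            counts[round(reward, 6)] += 1
    return counts


hist = reward_histogram(ToyEnv(), n_actions=4)
```

With a real environment, `ToyEnv()` would be replaced by the environment instance and `n_actions` by the size of its action space. Tallying episode returns alongside per-step rewards may also help show whether larger rewards ever occur.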

ashwin-M-D commented 3 years ago

Hello @mawbray, the reward distribution is actually fine for training a reinforcement learning model. However, the reward values can be changed by passing the required values through env_config. The reward distribution needed for effective learning may differ between models, so you can adjust the values as required.
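
The general pattern of overriding reward settings through a config dictionary can be sketched as follows. This is a generic illustration, not or-gym's implementation: `ToyRoutingEnv`, `step_penalty`, and `delivery_reward` are hypothetical names with made-up defaults, and the real attribute names should be taken from `vehicle_routing.py`.

```python
class ToyRoutingEnv:
    """Hypothetical environment; attribute names are illustrative only."""

    def __init__(self, env_config=None):
        # Default reward settings (made-up values for this sketch).
        self.step_penalty = 0.1
        self.delivery_reward = 1.0

        # Override defaults with user-supplied values, rejecting unknown keys
        # so that a misspelled setting fails loudly instead of being ignored.
        for key, value in (env_config or {}).items():
            if not hasattr(self, key):
                raise AttributeError(f"unknown config key: {key}")
            setattr(self, key, value)


default_env = ToyRoutingEnv()
custom_env = ToyRoutingEnv(env_config={"step_penalty": 0.5})
```

In this sketch only the keys present in `env_config` change; every other setting keeps its default.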

mawbray commented 3 years ago

Hi @ashwin-M-D, okay, thank you for the clarification. All the best :)

osarwar commented 2 years ago

@mawbray @ashwin-M-D Thanks for your help guys!