When using the RLlib training example, the scenario is not added to the ScanningTimeReward until GeneralSatelliteTasking calls link. Also added the PPO train call to the example notebook. The reward is also a dict, not a float, so the average is now taken over its values, which seems like what was intended, but does not matter with one satellite.
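For reviewers, here is a minimal sketch of the averaging described above. It is not the code in this PR: the helper name `mean_reward` and the satellite keys `sat_0` and `sat_1` are hypothetical, and the sketch assumes the reward dict maps one key per satellite to a numeric reward.

```python
from statistics import fmean


def mean_reward(reward):
    """Collapse a reward into a single float.

    Hypothetical helper: accepts either a plain number or a dict of
    per-satellite rewards (assumed shape), and returns the mean.
    """
    if isinstance(reward, dict):
        # An empty dict has no mean; treat it as zero reward here.
        return fmean(reward.values()) if reward else 0.0
    return float(reward)


# With one satellite the mean equals that satellite's reward.
single = mean_reward({"sat_0": 1.5})
# With several satellites the rewards are averaged.
multi = mean_reward({"sat_0": 1.0, "sat_1": 3.0})
print(single, multi)
```

With a single satellite the mean is just that satellite's reward, which is why the change makes no visible difference in the example.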
Closes #147
Type of change
[ ] Bug fix (non-breaking change which fixes an issue)
How should this pull request be reviewed?
[ ] By commit
How Has This Been Tested?
Just running the example, with
Please describe the tests that you ran to verify your changes.
ScanningTimeReward and scenario init order
Passes Tests
I have not got the unit tests working in my environment.