Closed Lucaweihs closed 2 years ago
This pull request introduces 6 alerts and fixes 2 when merging d1d535353cd7e67433d20ff541862955b696c936 into 9da8674e7781370b4c257eab707a613e953c002f - view on LGTM.com
new alerts:
fixed alerts:
This pull request fixes 2 alerts when merging 7c7c88ea93d719ce75b3646216201dcbf90b1a07 into 9da8674e7781370b4c257eab707a613e953c002f - view on LGTM.com
fixed alerts:
Thanks for the review Jordi :)! Finally got to your questions / suggestions. One thing that's missing currently is some good tutorial/docs on how to use these StageComponent
s and new storages. I think I'll need to save that for later though as writing a good tutorial always takes longer than anticipated. Let me know if you have any ideas regarding the training pipeline problem before I go ahead an merge things.
This pull request fixes 2 alerts when merging eca262da45792d411d2d4e89f76dbd0db6cd5b0a into 9da8674e7781370b4c257eab707a613e953c002f - view on LGTM.com
fixed alerts:
This pull request fixes 2 alerts when merging 2c9ab5bd92df1c7fef768259b8337d46b6e5a326 into 42407f2472ccdb2acb27e6dd2ffb586f48904970 - view on LGTM.com
fixed alerts:
Our current
RolloutStorage
class is quite limiting as it restricts us to algorithms/losses that can be computed purely onpolicy and models that operate on the single, most-recent, collection of observations. This PR generalizes theRolloutStorage
class so that users can define their own storage classes. Of course defining storage classes does little if there is no mechanism to compute losses/updates given data from these storage objects so this PR also introduces a new mechanism for defining such losses at theTrainingPipeline
level (using aStageComponent
class that can be added toPipelineStage
).