SforAiDl / genrl

A PyTorch reinforcement learning library for generalizable and reproducible algorithm implementations with an aim to improve accessibility in RL
https://genrl.readthedocs.io
MIT License
403 stars, 59 forks

[WIP] TOML saving and loading hyperparams #286

Closed Het-Shah closed 3 years ago

Het-Shah commented 4 years ago

So far I have added only the saving feature, which writes out the hyperparams returned by the get_hyperparams method. I will add loading next.
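For illustration, here is a minimal sketch of what saving and loading a flat hyperparameter dictionary as TOML could look like. It is not the code from this PR: the helper names `to_toml_value`, `dump_hyperparams`, and `load_hyperparams`, the `[hyperparams]` table name, and the sample values are all assumptions, and the dictionary merely stands in for whatever get_hyperparams returns. Writing is done by hand for simple value types, and reading uses `tomllib` (standard library from Python 3.11, with the `tomli` backport as a fallback).

```python
import json

try:
    import tomllib  # standard library from Python 3.11 onwards
except ModuleNotFoundError:  # older interpreters need the tomli backport
    import tomli as tomllib


def to_toml_value(value):
    """Format one Python value as a TOML literal (simple types only)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        # JSON string escaping is compatible with TOML basic strings
        # for ordinary text.
        return json.dumps(value, ensure_ascii=False)
    if isinstance(value, (list, tuple)):
        return "[" + ", ".join(to_toml_value(v) for v in value) + "]"
    raise TypeError(f"unsupported hyperparameter type: {type(value).__name__}")


def dump_hyperparams(hyperparams, table="hyperparams"):
    """Serialize a flat dict to TOML text under one table.

    Assumes every key is a valid bare TOML key (letters, digits, _ and -).
    """
    lines = [f"[{table}]"]
    lines += [f"{key} = {to_toml_value(val)}" for key, val in hyperparams.items()]
    return "\n".join(lines) + "\n"


def load_hyperparams(text, table="hyperparams"):
    """Parse TOML text and return the hyperparameter table as a dict."""
    return tomllib.loads(text)[table]


# Hypothetical values standing in for the output of get_hyperparams().
hyperparams = {"lr": 0.001, "gamma": 0.99, "batch_size": 64, "layers": [64, 64]}
text = dump_hyperparams(hyperparams)
restored = load_hyperparams(text)
```

Tuples come back as lists after a round trip, so anything that depends on the container type would need to convert them again on load.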

249

Sharad24 commented 4 years ago

Let's wrap this up soon?

Het-Shah commented 4 years ago

I don't think we need to use getattr as well. get_hyperparams should work just fine then.

Sharad24 commented 4 years ago

Ok sounds good

Het-Shah commented 4 years ago

Review this? @Sharad24

Het-Shah commented 4 years ago

Can you add a short tutorial file on how to load

Yeah, sure, I'll add these.

Sharad24 commented 4 years ago

CI is failing?

Het-Shah commented 4 years ago

Linting is failing. Wait, I'll fix that.

codecov[bot] commented 4 years ago

Codecov Report

Merging #286 into master will decrease coverage by 0.02%. The diff coverage is 65.00%.

@@            Coverage Diff             @@
##           master     #286      +/-   ##
==========================================
- Coverage   90.06%   90.03%   -0.03%     
==========================================
  Files          88       88              
  Lines        3685     3694       +9     
==========================================
+ Hits         3319     3326       +7     
- Misses        366      368       +2     
Impacted Files                         Coverage Δ
genrl/agents/deep/ddpg/ddpg.py         92.30% <0.00%> (ø)
genrl/agents/deep/dqn/base.py          94.25% <0.00%> (ø)
genrl/agents/deep/sac/sac.py           93.10% <0.00%> (ø)
genrl/agents/deep/td3/td3.py           92.72% <0.00%> (ø)
genrl/agents/deep/a2c/a2c.py           93.24% <33.33%> (ø)
genrl/agents/deep/vpg/vpg.py           94.33% <33.33%> (ø)
genrl/agents/deep/base/offpolicy.py    97.40% <50.00%> (ø)
genrl/trainers/base.py                 88.04% <80.95%> (-1.12%) :arrow_down:
genrl/agents/deep/base/base.py         93.54% <100.00%> (ø)
genrl/agents/deep/ppo1/ppo1.py         100.00% <100.00%> (ø)
... and 2 more

Het-Shah commented 4 years ago

Looks ready now @Sharad24

Sharad24 commented 4 years ago

The build is failing. I tried re-running it, but that doesn't work. Can you look into this?

Het-Shah commented 4 years ago

=================================== FAILURES ===================================
__________________ TestCBAgent.test_linear_posterior_agent ___________________

self = <tests.test_bandit.test_cb_agents.TestCBAgent object at 0x13d906090>

def test_linear_posterior_agent(self) -> None:
    self._test_fn(LinearPosteriorAgent)

tests/test_bandit/test_cb_agents.py:39:


tests/test_bandit/test_cb_agents.py:35: in _test_fn
    trainer.train(timesteps=10, update_interval=2, update_after=5, batch_size=2)
genrl/trainers/bandit.py:182: in train
    action, kwargs.get("batch_size", 64), train_epochs


self = <genrl.agents.bandits.contextual.linpos.LinearPosteriorAgent object at 0x13d659390>
action = tensor(0, dtype=torch.int32), batch_size = 2, train_epochs = 20

def update_params(
    self, action: int, batch_size: int = 512, train_epochs: Optional[int] = None
):
    """Update parameters of the agent.

    Updated the posterior over beta though bayesian regression.

    Args:
        action (int): Action to update the parameters for.
        batch_size (int, optional): Size of batch to update parameters with.
            Defaults to 512
        train_epochs (Optional[int], optional): Epochs to train neural network for.
            Not applicable in this agent. Defaults to None
    """
    self.update_count += 1

    x, y = self.db.get_data_for_action(action, batch_size)
    x = torch.cat([x, torch.ones(x.shape[0], 1)], dim=1)
    inv_cov = torch.mm(x.T, x) + self.lambda_prior * torch.eye(self.context_dim + 1)
    cov = torch.inverse(inv_cov)

E RuntimeError: inverse_cpu: U(6,6) is zero, singular U.

genrl/agents/bandits/contextual/linpos.py:148: RuntimeError
----------------------------- Captured stdout call -----------------------------
Started at 07-09-20 14:41:35
Training LinearPosteriorAgent on CovertypeDataBandit for 10 timesteps

This is the error trace. I don't really know what it means.
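As background on the message itself: `torch.inverse` raises "singular U" when the matrix it is given has no inverse. The sketch below shows one way a matrix of the form `x.T @ x + lambda * I` can become singular. It is not confirmed to be the cause of this failure, and the helper names `gram_plus_ridge` and `determinant` and the sample rows are assumptions made for illustration. With only two samples (as with batch_size = 2 in the trace) and three columns including the bias column, `x.T @ x` has rank at most 2, so it is singular unless a positive ridge term is added. Pure Python with exact fractions is used so the result does not depend on floating-point rounding.

```python
from fractions import Fraction


def gram_plus_ridge(rows, lam):
    """Return X^T X + lam * I for a list of equal-length feature rows."""
    dim = len(rows[0])
    return [
        [
            sum(Fraction(r[i]) * Fraction(r[j]) for r in rows)
            + (Fraction(lam) if i == j else 0)
            for j in range(dim)
        ]
        for i in range(dim)
    ]


def determinant(matrix):
    """Exact determinant by Gaussian elimination with row swaps."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no usable pivot: the matrix is singular
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


# Two samples, two features plus a trailing bias column of ones
# (mirroring the torch.ones column in update_params).
rows = [[1, 2, 1], [3, 4, 1]]
det_without_prior = determinant(gram_plus_ridge(rows, 0))
det_with_prior = determinant(gram_plus_ridge(rows, 1))
```

Here `det_without_prior` is zero, so the matrix cannot be inverted, while `det_with_prior` is positive because adding a positive multiple of the identity to `x.T @ x` makes it positive definite.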

Sharad24 commented 4 years ago

Can you sync up this branch with the master? I think this was an issue a few days back, but was resolved.

Het-Shah commented 4 years ago

Yea cool

Het-Shah commented 4 years ago

Still getting it?

Sharad24 commented 3 years ago

Fetch and merge again; the test was removed in #329.