takuseno / d3rlpy

An offline deep reinforcement learning library
https://takuseno.github.io/d3rlpy
MIT License
1.29k stars 230 forks source link

[QUESTION/FEATURE] Track conservative loss contributions in CQL #257

Closed joshuaspear closed 4 months ago

joshuaspear commented 1 year ago

Is your feature request related to a problem? Please describe. I'm not sure if I am missing something but I would like the ability to easily track different contributions to a given loss - I can't seem to find an easy way to implement this

Describe the solution you'd like I would like the ability to track the individual contributions to the loss function for example, track the MSE of the Bellman error and regularisation separately for CQL on the validation set

paulhilders commented 1 year ago

Did you manage to find a workaround solution for this? Tracking the regularisation term separately would be very useful for the detection of underfitting during training.

joshuaspear commented 1 year ago

Exactly the reason I wanted it as well! :) I haven't yet sorry - I may decide to use an RL library however, will comment on here if I do end up implementing

paulhilders commented 1 year ago

That would be great! Which RL library were you planning to use, if I may ask?

joshuaspear commented 1 year ago

Looking at RLLIB in RAY (https://github.com/ray-project/ray/tree/master/rllib). Seems pretty flexible once you've grappled with the abstractions. The distributed aspect isn't relevant for me now but thought I may as well learn an industry framework given the direction of the field! More than open to other suggestions though :)

takuseno commented 1 year ago

Sorry for the inconvenience. Currently, it's not yet supported. But, in the short term, you can simply hack d3rlpy to track metrics. You can modify these parts to include regularization loss in metrics:

It will break typing checks, but should work.

joshuaspear commented 1 year ago

Hey @takuseno, I'm trying to submit a pull request to add this tracking for the DiscreteCQL model (I will aim to add it for the continuous CQL model later) however, I can't seem to push to a new branch on the git repo in order to open the pull request. Please could I have some guidance.

I have also run the required tests and have a couple of questions:

  1. A large number of the tests failed however, only 1 of which seems to be related to the changes that I have made. I have eddited the d3rlpy/algos/cql.py and d3rlpy/algos/torch/cql_impl.py scripts. All of the DiscreteCQL tests have passed, except for 1 of the performance tests. I noticed that a number of the other performance tests for other algorithms also fail - is this expected?
  2. When running the formatting and linting, the processes raised issues with a number of files which I have not edditted, I will not include these changes in the PR. I just wanted to confirm that was okay, please?

Thanks!

joshuaspear commented 1 year ago

Looking at RLLIB in RAY (https://github.com/ray-project/ray/tree/master/rllib). Seems pretty flexible once you've grappled with the abstractions. The distributed aspect isn't relevant for me now but thought I may as well learn an industry framework given the direction of the field! More than open to other suggestions though :)

FYI - rllib is an absolute nightmare unless you're doing proper distributed stuff - so many overheads! Back to d3rlpy!

takuseno commented 1 year ago

@joshuaspear Hi, thanks for trying to make a PR!

test

For the tests, you shouldn't run performance test because it's not really maintained anymore. Please simply run this:

./scripts/test

formatting

Recently, formatting libraries have an incompatible update. So please install those by specifying versions like this:

pip install black==22.3.0 mypy==0.942 pylint==2.13.5 yapf isort

And, then run:

./scripts/format
./scripts/lint
joshuaspear commented 1 year ago

Great - thank you, I will rerun the tests. Please can I confirm how I should push the code to Github? Under a new branch?

joshuaspear commented 1 year ago

Hi @takuseno, I have run all the tests and everything has passed (FYI - I had to update the Pendulum environment from v0 to v1) however, I won't push this change. Please let me know the next steps to integrate the PR. Cheers

takuseno commented 1 year ago

Sounds good. Thanks for the change! You can't directly push your commits to this repository. Here are the steps:

  1. fork this repository in GitHub
  2. clone your fork repository in your machiine
  3. make changes, commits them, and push the branch
  4. make a new pull request in GitHub

This document might be helpful. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request-from-a-fork

paulhilders commented 1 year ago

@joshuaspear It seems very useful! Many thanks for picking this up!

takuseno commented 4 months ago

Please let me close this since I believe that this issue is already resolved in the latest release.