Fix TD3 DDPG Implementation: Move Sampling Inside Gradient Step LoopFix implementation mismatch

Toni-SM / skrl

Modular reinforcement learning library (on PyTorch and JAX) with support for NVIDIA Isaac Gym, Omniverse Isaac Gym and Isaac Lab

https://skrl.readthedocs.io/

MIT License

443 stars 43 forks source link

Fix TD3 DDPG Implementation: Move Sampling Inside Gradient Step LoopFix implementation mismatch #147

Closed alessandroassirelli98 closed 2 months ago

alessandroassirelli98 commented 2 months ago

This pull request addresses a discrepancy between the original TD3 and DDPG paper's algorithm and the current implementation in the repository. Specifically, the original implementation performs the sampling step outside of the gradient step loop, which diverges from the methodology outlined in the paper. We have corrected this by moving the sampling process inside the gradient step loop, aligning the implementation more closely with the intended algorithmic procedure described in the original paper and SpinningUp description.

Toni-SM commented 2 months ago

Hi @alessandroassirelli98

Great!

Please, also update:

docs
- https://github.com/Toni-SM/skrl/blob/631613a8d8950b0a6cf3efa44d96cf6349565caa/docs/source/api/agents/ddpg.rst?plain=1#L50-L51
- https://github.com/Toni-SM/skrl/blob/631613a8d8950b0a6cf3efa44d96cf6349565caa/docs/source/api/agents/td3.rst?plain=1#L50-L51

CHANGELOG under the following version (create it if it doesn't exist)

## [1.2.0] - Unreleased
### Fixed
- YOUR_FIX_IMPLEMENTATION_SHORT_DESCRIPTION

alessandroassirelli98 commented 2 months ago

Hi @Toni-SM ! I see the current version is 1.1.0. Should I just add a comment on that version in the changelog, like so? Screenshot from 2024-04-16 17-29-56

Toni-SM commented 2 months ago

I see the current version is 1.1.0.

Version 1.1.0 is already released (see: https://github.com/Toni-SM/skrl/releases)

New additions/changes/fixes (like the one you proposed) will be released under version 1.2.0. Please, use this for CHANGELOG:

## [1.2.0] - Unreleased
### Fixed
- YOUR_FIX_IMPLEMENTATION_SHORT_DESCRIPTION