DLR-RM / stable-baselines3
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
Docs: https://stable-baselines3.readthedocs.io
MIT License · 8.35k stars · 1.6k forks
Issues
#1913 Hotfix: revert loading with `weights_only=True` (araffin, closed 2 months ago, 0 comments)
#1912 [Bug]: evaluate_policy called multiple times for vectorized environments (LukasFehring, opened 2 months ago, 5 comments)
#1911 [Bug]: Load Trained Policy (zlw21gxy, closed 2 months ago, 8 comments)
#1910 Fix tensorboard video slow numpy->torch conversion (NickLucche, closed 2 months ago, 0 comments)
#1909 Discrepancy between Observations Sampled from Gym Env and Replay Buffer (AOAA96, closed 2 months ago, 3 comments)
#1908 Fix memory leak in base_class.py (peteole, closed 1 month ago, 7 comments)
#1907 Scaling Environment (Hamza-101, closed 1 month ago, 6 comments)
#1906 [Bug]: Scaling Environment (Hamza-101, closed 2 months ago, 9 comments)
#1905 Scalability (Hamza-101, closed 2 months ago, 2 comments)
#1904 Adding ER-MRL to community projects (corentinlger, closed 2 months ago, 1 comment)
#1903 [Question] How to avoid SAC getting stuck in local minima (JaimeParker, closed 2 months ago, 1 comment)
#1902 Weights only param (markscsmith, opened 2 months ago, 1 comment)
#1901 Cast learning_rate to float lambda for pickle safety when doing model.load (markscsmith, closed 2 months ago, 1 comment)
#1900 [Bug]: if learning_rate function uses special types, they can cause torch.load to fail when weights_only=True (markscsmith, closed 2 months ago, 4 comments)
#1899 Parameterize weights_only during load to allow loading of unusual models (markscsmith, closed 2 months ago, 2 comments)
#1898 [Question] Discontinuous reward training curve (JaimeParker, closed 2 months ago, 4 comments)
#1897 [Question] policy gradient loss and explained variance very small (almost zero) from the training start? (Ahmed-Radwan094, closed 2 months ago, 2 comments)
#1896 [Feature Request] Enable predict to take tensor as input (llewynS, closed 2 months ago, 3 comments)
#1895 Off-policy algorithm policy_kwargs (suargi, closed 2 months ago, 2 comments)
#1894 [Bug]: Potential Bug in PPO? Clarification requested (azrael417, closed 2 months ago, 2 comments)
#1893 [Question] CheckpointCallback keep last K (NickLucche, closed 2 months ago, 2 comments)
#1892 Issue (HER within SAC algorithm) (wadeKeith, closed 2 months ago, 2 comments)
#1891 [Question] Saving PPO rollout buffer on GPU (Ahmed-Radwan094, closed 2 months ago, 2 comments)
#1890 [Bug]: EOFError after running for some steps (GeorgeWuzy, closed 2 months ago, 1 comment)
#1889 [Question] How to pass a varying gamma to DQN or PPO during training? (rariss, opened 2 months ago, 6 comments)
#1888 Why does the Logger only return the train/ metrics, and not eval/, time/, and rollout/? (liamquantrill, closed 2 months ago, 1 comment)
#1887 [Question] Discretize continuous actions/observations? (nrigol, closed 2 months ago, 1 comment)
#1886 Training of PPO freezes after a number of iterations (Ahmed-Radwan094, closed 2 months ago, 8 comments)
#1885 [Question] Influence of buffer size when using vecenv and saving a customized replay buffer (JaimeParker, closed 2 months ago, 2 comments)
#1884 Fixed broken link in ppo.rst (chaitanyabisht, closed 2 months ago, 0 comments)
#1883 Why does VecFrameStack clear the prior frames in the stack for the step when "terminated=True"? (wkwan, closed 2 months ago, 2 comments)
#1882 Fix typo in changelog (araffin, closed 3 months ago, 0 comments)
#1881 How to elegantly modify an algorithm by adding new architectures trained with custom losses? (jamesheald, closed 3 months ago, 2 comments)
#1880 [Question] [Multiprocessing] RolloutBuffer groups environment transitions on a per-environment basis (N00bcak, closed 3 months ago, 1 comment)
#1879 Release v2.3.0 (araffin, closed 3 months ago, 0 comments)
#1878 How does stable-baselines work with a multi-agent PettingZoo environment? (AnastasiaPsarou, closed 3 months ago, 1 comment)
#1877 [Feature Request] Resume trained model with set_parameters without reset_num_timesteps (tanielsfranklin, closed 3 months ago, 4 comments)
#1876 [Question] Action masking for a DQN Agent (Tim1605, closed 3 months ago, 1 comment)
#1875 [Question] Changes in observations (d505, closed 3 months ago, 1 comment)
#1874 [Question] Training PPO model with single-step episodes (oshadajay, closed 3 months ago, 7 comments)
#1873 Exporting MultiInputActorCriticPolicy as ONNX (MaximCamilleri, opened 3 months ago, 5 comments)
#1872 [Question] Control PPO training (mwalidcharrwi, closed 3 months ago, 0 comments)
#1871 [Question] How can I wrap a model trained on non-image observations via an image observation wrapper? (zichunxx, closed 3 months ago, 5 comments)
#1870 Log success rate for on-policy algorithms (corentinlger, closed 3 months ago, 4 comments)
#1869 [Question] SubprocVecEnv doesn't work with registered custom environments (marcusfechner, closed 3 months ago, 3 comments)
#1868 [Feature Request] Allow Gymnasium Composite Spaces (flowerthrower, closed 3 months ago, 2 comments)
#1867 [Bug]: `rollout/success_rate` does not show for Monitor + OnPolicyAlgorithm (N00bcak, closed 3 months ago, 1 comment)
#1866 Update ruff and documentation for hf sb3 (araffin, closed 3 months ago, 0 comments)
#1865 [Bug]: unsupported operand for +: 'float' and 'NoneType' during PPO Training with Custom DSSAT Gym Wrapper (louisreberga, closed 3 months ago, 1 comment)
#1864 [Bug]: PPO handles TimeLimit.truncated incorrectly (yinzikang, closed 4 months ago, 2 comments)