DLR-RM/stable-baselines3
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
https://stable-baselines3.readthedocs.io
MIT License
8.85k stars, 1.68k forks
issues
SAC model not properly saved
#1916
PabloVD
closed
4 months ago
5
ValueError: could not broadcast input array from shape (23,) into shape (27,)
#1915
n-kish
closed
4 months ago
2
Handling mission space in BabyAI env
#1914
Chainesh
closed
5 months ago
6
Hotfix: revert loading with `weights_only=True`
#1913
araffin
closed
5 months ago
0
[Bug]: evaluate_policy called multiple times for vectorized environments
#1912
LukasFehring
opened
5 months ago
5
[Bug]: Load Trained Policy
#1911
zlw21gxy
closed
5 months ago
8
Fix tensorboard video slow numpy->torch conversion
#1910
NickLucche
closed
5 months ago
0
Discrepancy between Observations Sampled from Gym Env and Replay Buffer
#1909
AOAA96
closed
5 months ago
3
Fix memory leak in base_class.py
#1908
peteole
closed
4 months ago
7
Scaling Environment
#1907
Hamza-101
closed
4 months ago
6
[Bug]: Scaling Environment
#1906
Hamza-101
closed
5 months ago
9
Scalability
#1905
Hamza-101
closed
5 months ago
2
Adding ER-MRL to community project
#1904
corentinlger
closed
5 months ago
1
[Question] How to keep SAC from getting stuck in local minima
#1903
JaimeParker
closed
5 months ago
1
Weights only param
#1902
markscsmith
opened
5 months ago
1
Cast learning_rate to float lambda for pickle safety when doing model.load
#1901
markscsmith
closed
5 months ago
1
[Bug]: if learning_rate function uses special types, they can cause torch.load to fail when weights_only=True
#1900
markscsmith
closed
5 months ago
4
Parameterize weights_only during load to allow loading of unusual models
#1899
markscsmith
closed
5 months ago
2
[Question] Discontinuous reward training curve
#1898
JaimeParker
closed
5 months ago
4
[Question] policy gradient loss and explained variance very small (almost zero) from the training start?
#1897
Ahmed-Radwan094
closed
5 months ago
2
[Feature Request] Enable predict to take tensor as input
#1896
llewynS
closed
5 months ago
3
Off policy algorithm policy_kwargs
#1895
suargi
closed
5 months ago
2
[Bug]: Potential Bug in PPO? Clarification requested
#1894
azrael417
closed
5 months ago
2
[Question] CheckpointCallback keep last K
#1893
NickLucche
closed
5 months ago
2
Issue (HER within SAC algorithm)
#1892
wadeKeith
closed
5 months ago
2
[Question] Saving PPO rollout buffer on GPU
#1891
Ahmed-Radwan094
closed
5 months ago
2
[Bug]: EOFError after running for some steps
#1890
GeorgeWuzy
closed
5 months ago
1
[Question] How to pass a varying gamma to DQN or PPO during training?
#1889
rariss
closed
1 month ago
6
Why does the Logger only return the train/ metrics, and not eval/, time/, and rollout/?
#1888
liamquantrill
closed
5 months ago
1
[Question] Discretize continuous actions/observations ?
#1887
nrigol
closed
5 months ago
1
Training of PPO freezes after number of iterations
#1886
Ahmed-Radwan094
closed
5 months ago
8
[Question] influence of buffer size when using vecenv and save customized replay buffer
#1885
JaimeParker
closed
5 months ago
2
Fixed broken link in ppo.rst
#1884
chaitanyabisht
closed
5 months ago
0
Why does VecFrameStack clear the prior frames in the stack for the step when "terminated=True"?
#1883
wkwan
closed
5 months ago
2
Fix typo in changelog
#1882
araffin
closed
6 months ago
0
How to elegantly modify an algorithm by adding new architectures trained with custom losses?
#1881
jamesheald
closed
6 months ago
2
[Question] [Multiprocessing] RolloutBuffer groups environment transitions on a per-environment basis.
#1880
N00bcak
closed
6 months ago
1
Release v2.3.0
#1879
araffin
closed
6 months ago
0
How does stable-baselines work with a multi-agent pettingzoo environment?
#1878
AnastasiaPsarou
closed
6 months ago
1
[Feature Request] Resume trained model with set_parameters without reset_num_timesteps
#1877
tanielsfranklin
closed
6 months ago
4
[Question] Action masking for a DQN Agent
#1876
Tim1605
closed
6 months ago
1
[Question] Changes in observations
#1875
d505
closed
6 months ago
1
[Question] Training PPO model with single step episodes
#1874
oshadajay
closed
6 months ago
7
Exporting MultiInputActorCriticPolicy as ONNX
#1873
MaximCamilleri
opened
6 months ago
6
[Question] Control PPO training
#1872
mwalidcharrwi
closed
6 months ago
0
[Question] How can I wrap a non-image observation trained model via an image observation wrapper?
#1871
zichunxx
closed
6 months ago
5
Log success rate for on policy algorithms
#1870
corentinlger
closed
6 months ago
4
[Question] SubprocVecEnv doesn't work with registered custom environments
#1869
marcusfechner
closed
6 months ago
3
[Feature Request] Allow Gymnasium Composite Spaces
#1868
flowerthrower
closed
6 months ago
2
[Bug]: `rollout/success_rate` does not show for Monitor + OnPolicyAlgorithm
#1867
N00bcak
closed
6 months ago
1