[question] Abnormal running speed of off-policy algorithms on Mujoco environments.

muchvo commented 1 year ago

[x] I have marked all applicable categories:
- [ ] exception-raising bug
- [ ] RL algorithm bug
- [ ] documentation request (i.e. "X is missing from the documentation.")
- [ ] new feature request
[x] I have visited the source website
[x] I have searched through the issue tracker for duplicates

[x] I have mentioned version numbers, operating system and environment, where applicable:

About my environment:
System: Ubuntu 20.04
CPU: AMD EPYC 7H12 64-Core Processor
GPU: [GeForce RTX 3090]*8

_libgcc_mutex             0.1                        main    defaults
_openmp_mutex             5.1                       1_gnu    defaults
absl-py                   1.4.0                    pypi_0    pypi
ca-certificates           2023.01.10           h06a4308_0    defaults
cachetools                5.3.0                    pypi_0    pypi
certifi                   2022.12.7        py38h06a4308_0    defaults
charset-normalizer        3.1.0                    pypi_0    pypi
cloudpickle               2.2.1                    pypi_0    pypi
dm-env                    1.6                      pypi_0    pypi
dm-tree                   0.1.8                    pypi_0    pypi
envpool                   0.8.2                    pypi_0    pypi
glfw                      2.5.6                    pypi_0    pypi
google-auth               2.16.2                   pypi_0    pypi
google-auth-oauthlib      0.4.6                    pypi_0    pypi
grpcio                    1.51.3                   pypi_0    pypi
gym                       0.26.2                   pypi_0    pypi
gym-notices               0.0.8                    pypi_0    pypi
gymnasium                 0.26.3                   pypi_0    pypi
gymnasium-notices         0.0.1                    pypi_0    pypi
h5py                      3.8.0                    pypi_0    pypi
idna                      3.4                      pypi_0    pypi
imageio                   2.25.0                   pypi_0    pypi
importlib-metadata        6.0.0                    pypi_0    pypi
jax-jumpy                 0.2.0                    pypi_0    pypi
ld_impl_linux-64          2.38                 h1181459_1    defaults
libffi                    3.4.2                h6a678d5_6    defaults
libgcc-ng                 11.2.0               h1234567_1    defaults
libgomp                   11.2.0               h1234567_1    defaults
libstdcxx-ng              11.2.0               h1234567_1    defaults
llvmlite                  0.39.1                   pypi_0    pypi
markdown                  3.4.1                    pypi_0    pypi
markupsafe                2.1.2                    pypi_0    pypi
mujoco                    2.3.0                    pypi_0    pypi
ncurses                   6.4                  h6a678d5_0    defaults
numba                     0.56.4                   pypi_0    pypi
numpy                     1.23.5                   pypi_0    pypi
nvidia-cublas-cu11        11.10.3.66               pypi_0    pypi
nvidia-cuda-nvrtc-cu11    11.7.99                  pypi_0    pypi
nvidia-cuda-runtime-cu11  11.7.99                  pypi_0    pypi
nvidia-cudnn-cu11         8.5.0.96                 pypi_0    pypi
oauthlib                  3.2.2                    pypi_0    pypi
openssl                   1.1.1t               h7f8727e_0    defaults
optree                    0.9.0                    pypi_0    pypi
packaging                 23.0                     pypi_0    pypi
pettingzoo                1.22.3                   pypi_0    pypi
pillow                    9.4.0                    pypi_0    pypi
pip                       23.0.1           py38h06a4308_0    defaults
protobuf                  4.22.1                   pypi_0    pypi
pyasn1                    0.4.8                    pypi_0    pypi
pyasn1-modules            0.2.8                    pypi_0    pypi
pygame                    2.1.0                    pypi_0    pypi
pyopengl                  3.1.6                    pypi_0    pypi
python                    3.8.16               h7a1cb2a_3    defaults
pyyaml                    6.0                      pypi_0    pypi
readline                  8.2                  h5eee18b_0    defaults
requests                  2.28.2                   pypi_0    pypi
requests-oauthlib         1.3.1                    pypi_0    pypi
rsa                       4.9                      pypi_0    pypi
safety-gymnasium          0.1.1                    pypi_0    pypi
setuptools                65.6.3           py38h06a4308_0    defaults
six                       1.16.0                   pypi_0    pypi
sqlite                    3.40.1               h5082296_0    defaults
tensorboard               2.12.0                   pypi_0    pypi
tensorboard-data-server   0.7.0                    pypi_0    pypi
tensorboard-plugin-wit    1.8.1                    pypi_0    pypi
tianshou                  0.4.11                    dev_0    <develop>
tk                        8.6.12               h1ccaba5_0    defaults
torch                     1.13.1                   pypi_0    pypi
tqdm                      4.65.0                   pypi_0    pypi
types-protobuf            4.22.0.2                 pypi_0    pypi
typing-extensions         4.5.0                    pypi_0    pypi
urllib3                   1.26.15                  pypi_0    pypi
werkzeug                  2.2.3                    pypi_0    pypi
wheel                     0.38.4           py38h06a4308_0    defaults
xmltodict                 0.13.0                   pypi_0    pypi
xz                        5.2.10               h5eee18b_1    defaults
zipp                      3.15.0                   pypi_0    pypi
zlib                      1.2.13               h5eee18b_0    defaults

import tianshou, gymnasium as gym, torch, numpy, sys
print(tianshou.__version__, gym.__version__, torch.__version__, numpy.__version__, sys.version, sys.platform)

When I run the off-policy examples in https://github.com/thu-ml/tianshou/blob/master/examples/mujoco/ (for example, TD3), the speed is unacceptable for a 200-epoch run. I am running them because I am trying to integrate Tianshou's API into OmniSafe.

Observations shape: (111,)
Actions shape: (8,)
Action range: -1.0 1.0
Epoch #1:   0%| | 5/5000 [00:07<1:57:17,  1.41s/it, env_step=4, len=0, loss/actor=0.142, loss/crit
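For scale, the rate in the progress bar can be extrapolated to a full run. The sketch below is only back-of-the-envelope arithmetic: it assumes the 1.41 s/it rate and the 5000 iterations per epoch shown in the bar stay constant, and it uses the 200 epochs mentioned above. The function name `estimate_hours` is made up for illustration.

```python
def estimate_hours(seconds_per_iter, iters_per_epoch, epochs):
    """Extrapolate total wall-clock hours from a constant per-iteration rate."""
    total_seconds = seconds_per_iter * iters_per_epoch * epochs
    return total_seconds / 3600.0


# Values read from the progress bar (1.41 s/it, 5000 it/epoch);
# 200 epochs is the run length mentioned in the report.
per_epoch_hours = estimate_hours(1.41, 5000, 1)
full_run_hours = estimate_hours(1.41, 5000, 200)

print(f"one epoch: {per_epoch_hours:.2f} h")
print(f"200 epochs: {full_run_hours:.1f} h ({full_run_hours / 24:.1f} days)")
```

Under these assumptions one epoch takes roughly two hours, consistent with the remaining-time estimate in the bar, and 200 epochs would take more than two weeks.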

It is strange, any help will be appreciated. Thanks!

muchvo commented 1 year ago

And this is a long-running SAC experiment with Tianshou: it took about 10 hours for the first 10 epochs.

Trinkle23897 commented 1 year ago

It shouldn't be so slow, will take a look soon, thanks for reporting this issue!

Trinkle23897 commented 1 year ago

can you run py-spy to get a profile result?

py-spy record -o profile.svg -- python xxx.py

and paste svg here

muchvo commented 1 year ago

> can you run py-spy to get a profile result?
> py-spy record -o profile.svg -- python xxx.py
> and paste svg here

Okay, I will try.

muchvo commented 1 year ago

Observations shape: (111,)
Actions shape: (8,)
Action range: -1.0 1.0
Aborted (core dumped)
Epoch #1: 1%| | 53/5000 [01:34<2:27:50, 1.79s/it, env_step=52, len=0, loss/actor=0.365, los^C

py-spy does not seem to work for me. It just prints "Aborted (core dumped)" while the script I launched keeps running.
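When a sampling profiler such as py-spy crashes, one possible fallback is the standard-library `cProfile` module, which runs inside the Python process. The sketch below is not taken from the Tianshou examples: `slow_part` and `run_training` are hypothetical stand-ins for the real training loop, and `profile_call` is a helper name made up for illustration.

```python
import cProfile
import io
import pstats
import time


def slow_part():
    # Hypothetical stand-in for an expensive call (e.g. an env step or a network update).
    time.sleep(0.01)


def run_training():
    # Hypothetical stand-in for the real training loop.
    for _ in range(5):
        slow_part()


def profile_call(func, top=10):
    """Run func under cProfile and return the top entries sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func()
    finally:
        profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


report = profile_call(run_training)
print(report)
```

For a whole script, `python -m cProfile -o profile.out xxx.py` collects the same kind of data without code changes. Note that `cProfile` is a deterministic profiler and adds overhead of its own, so absolute timings may differ from what py-spy would report.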

Trinkle23897 commented 1 year ago

is there a profile.svg in the same folder?

muchvo commented 1 year ago

[image: profile]

I am not sure this graph is correct: py-spy dumps core at a later stage, so I had to terminate it early (after no more than 1 minute).

Trinkle23897 commented 1 year ago

how many envs do you use? what's your command line?

muchvo commented 1 year ago

I am using bash, and the command is just

python tianshou/examples/mujoco/mujoco_td3.py

so all configurations are at their defaults.

MischaPanch commented 1 year ago

I'll try to look into it soon, thanks for reporting!

muchvo commented 1 year ago

> I'll try to look into it soon, thanks for reporting!

Thanks a lot for your reply! After I switched to a different machine, the problem went away, so I think it was a local issue. I sincerely appreciate your effort in this community.

thu-ml / tianshou

[question] Abnormal running speed of off-policy algorithms on Mujoco environments. #866