ScheiklP / sofa_zoo

Reinforcement learning scripts for sofa_env environments.
MIT License

tissue_manipulation: visualization #4

Closed. wjyustl closed this issue 6 months ago

wjyustl commented 6 months ago

Hi! Sorry to disturb you. I'm having some problems.

  1. (screenshot: Snipaste_2023-11-10_16-34-33) The image desired point / visual target does not appear during training. I am sure that "with_visual_target" = "True" and "COLOR_VIS_TARGET" = "(51 / 255.0, 223 / 255.0, 255 / 255.0)" in "scene_description.py".

  2. (screenshot: Snipaste_2023-11-10_16-43-23) How can I find the correct success rate to plot the Spatial Reasoning Track? The only related metric I can find in wandb is ep_successful_task_mean.

ScheiklP commented 6 months ago

Hi @wjyustl, no worries! :)

  1. If I am not mistaken, by default the target gets added to the image observation after rendering (https://github.com/ScheiklP/sofa_env/blob/main/sofa_env/scenes/tissue_manipulation/tissue_manipulation_env.py#L532). Could you run python3 sofa_env_devel/sofa_env/scenes/tissue_manipulation/tissue_manipulation_env.py to verify that the target is added to the RGB observation?

  2. Yes, the information about task success is stored in ep_successful_task_mean. To get the plot from the paper, we used the wandb API to download the data, and then plotted it with TikZ/pgfplots.
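A minimal sketch of the download step, assuming the metric was logged to wandb under a key such as ep_successful_task_mean (the real key may carry a prefix) and using a hypothetical run path. The helper names extract_series and download_metric are illustrative and not part of sofa_zoo.

```python
from typing import Any, Iterable


def extract_series(rows: Iterable[dict], metric: str, x_key: str = "_step"):
    """Collect (x, value) pairs, skipping rows where the metric was not logged."""
    xs: list[Any] = []
    ys: list[float] = []
    for row in rows:
        value = row.get(metric)
        if value is None or row.get(x_key) is None:
            continue
        xs.append(row[x_key])
        ys.append(float(value))
    return xs, ys


def download_metric(run_path: str, metric: str, x_key: str = "_step"):
    """Download one logged metric from a single wandb run.

    run_path has the form "entity/project/run_id" (placeholder values here).
    """
    import wandb  # imported lazily so extract_series can be used without wandb

    run = wandb.Api().run(run_path)
    # scan_history() iterates over every logged row as a dict.
    return extract_series(run.scan_history(), metric, x_key)


# Hypothetical usage (requires a wandb login and a real run path):
# steps, values = download_metric("my-entity/my-project/abc123", "ep_successful_task_mean")
```

Passing a different x_key selects another logged column as the x-axis, provided that column exists in the run history.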

wjyustl commented 6 months ago

Hi @ScheiklP, Sorry for my late response.

  1. I set "debug" in tissue_manipulation_env.py to False (https://github.com/ScheiklP/sofa_env/blob/main/sofa_env/scenes/tissue_manipulation/tissue_manipulation_env.py#L749) and the script runs successfully. However, the problem described earlier is still there. (screenshot: Snipaste_2023-11-13_10-17-31)

  2. Can you give further instructions on how to use wandb?

ScheiklP commented 6 months ago

Hi @wjyustl

  1. What happens if you set debug to true?
  2. The documentation is quite good: https://docs.wandb.ai/guides/track/public-api-guide

wjyustl commented 6 months ago

Hi @ScheiklP, if I set debug to True, I get this error in base.py: (screenshot: Snipaste_2023-11-13_17-23-24)

ScheiklP commented 6 months ago

That is a super weird error. Could you tell me more about the system?

wjyustl commented 6 months ago

@ScheiklP I know very little about the system. When I run tissue_manipulation_env.py, I get this error. Then I found debug, a variable I had not seen before, so I changed it from True to False, and it ran without error. The only remaining strange behavior is that the target point cannot be seen when rendering.

ScheiklP commented 6 months ago

With system I mean is it Ubuntu, WSL on Windows, are you using an NVIDIA or AMD GPU, stuff like that. :)

wjyustl commented 6 months ago

> With system I mean is it Ubuntu, WSL on Windows, are you using an NVIDIA or AMD GPU, stuff like that. :)

I'm sorry, I misunderstood. I am using Ubuntu with an NVIDIA GPU.

ScheiklP commented 6 months ago

Could you check if the installed packages are at the following versions?

gymnasium                     0.28.1
numpy                         1.24.4
open3d                        0.17.0
opencv-python                 4.8.0.74
pybind11                      2.10.4
pybind11-global               2.10.4
pygame                        2.5.1
pyglet                        1.5.21
PyOpenGL                      3.1.7
PyOpenGL-accelerate           3.1.7

wjyustl commented 6 months ago

@ScheiklP These are the versions of my installed packages:

gymnasium                     0.29.1
numpy                         1.26.1
open3d                        0.17.0
opencv-python                 4.8.1.78
pybind11                      2.6.1
pybind11-global               2.6.1
pygame                        2.5.2
pyglet                        1.5.21
pyopengl                      3.1.7
pyopengl-accelerate           3.1.7
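A comparison like this can also be scripted. The following is a minimal sketch using only the standard library, with the expected versions copied from the list requested earlier in this thread. The function name find_mismatches is illustrative and not part of sofa_env.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions requested earlier in this thread.
EXPECTED = {
    "gymnasium": "0.28.1",
    "numpy": "1.24.4",
    "open3d": "0.17.0",
    "opencv-python": "4.8.0.74",
    "pybind11": "2.10.4",
    "pybind11-global": "2.10.4",
    "pygame": "2.5.1",
    "pyglet": "1.5.21",
    "PyOpenGL": "3.1.7",
    "PyOpenGL-accelerate": "3.1.7",
}


def find_mismatches(expected, lookup=version):
    """Return {package: (expected, installed)} for every package that differs.

    A package that is not installed is reported with installed=None.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = lookup(name)
        except PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {wanted}, installed {found or 'not installed'}")
```

A mismatch is not necessarily the cause of an error; the script only narrows down where the two setups differ.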

ScheiklP commented 6 months ago

And you do have the appropriate drivers installed for your GPU, right?

wjyustl commented 6 months ago

> And you do have the appropriate drivers installed for your GPU, right?

Of course. All other envs and tasks can be trained normally.

ScheiklP commented 6 months ago

And the other environments also work with RenderMode.HEADLESS?

wjyustl commented 6 months ago

No. I chose RenderMode.HUMAN for visualization.

ScheiklP commented 6 months ago

Ok, so does the error also occur if you run a different environment with RenderMode.HEADLESS? It really looks like a pyglet error more than a sofa_env error. I am a bit confused about why screen does not return a config correctly.

wjyustl commented 6 months ago

@ScheiklP Thank you for your explanation. Tomorrow I will try changing the version of pyglet or some other packages, and I will report back here.

ScheiklP commented 6 months ago

Thanks! :)

wjyustl commented 6 months ago

Hi @ScheiklP, sorry to disturb you again.

  1. I have changed most of the package versions, but with RenderMode.HEADLESS the same NoSuchConfigException still occurs. Could there also be a problem on the C++ side? (screenshot: Snipaste_2023-11-14_10-03-02)

  2. I used wandb.ai to plot successful_task, but the result does not look right. The values appear to be discrete and oscillate wildly, which seems at odds with the figure in the paper. (screenshot: Snipaste_2023-11-14_10-04-06)

ScheiklP commented 6 months ago

No worries.

  1. That really looks like this error: https://github.com/pyglet/pyglet/issues/51. Sadly, I have never encountered this error, so I am not sure how I can help you there. It looks more like a problem with your current setup of drivers etc. The [ERROR] messages can be ignored. I have to update some templates to match the new SOFA logic.

  2. The figures in the paper are mean and standard deviation over 8 random seeds, so it is by nature a bit smoother. The image that you sent looks correct. RL is in general not as stable as e.g. supervised learning.

wjyustl commented 6 months ago

@ScheiklP Thanks. I am also using 8 random seeds, set via "number_of_envs" in CONFIG. However:

  1. The successful_task value seems to be discrete. With 8 tasks, each contributing 1 for success and 0 for failure, the resulting mean always belongs to the set {0, 0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875, 1}.
  2. I set total_timesteps to 3 million, but the x-axis of the plot is Step instead of num_timesteps. That is not what I wanted.

ScheiklP commented 6 months ago

number_of_envs is the number of environments that are used in parallel to collect samples. So 1 learning run is trained on samples from 8 environments (for parallelization). Each time a rollout reaches a terminal state, the information is added to model.ep_info_buffer in SB3 and the environment is reset to start the next rollout. The figures in the paper show 8 learning runs, each with a different random seed.

wjyustl commented 6 months ago

Yes. What I mean is: with the same number of envs (8), why does the success rate in your paper's figure look continuous instead of being restricted to the set {0, 0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875, 1}, as in my attempt?

ScheiklP commented 6 months ago

Because the values are averaged over 8 independent runs. I do agree it is a bit unlucky that the number of parallel envs and the number of independent runs are both 8.
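The effect of averaging can be illustrated with synthetic data. This is a rough sketch, assuming each run reports a success rate that is a mean over 8 episodes per logging point; the data and variable names are made up for illustration and do not come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_runs, n_points, n_episodes = 8, 50, 8

# Synthetic success probability that improves over training (assumption).
p_success = np.linspace(0.1, 0.9, n_points)

# successes[r, t, e] is True if episode e succeeded at logging point t of run r.
successes = rng.random((n_runs, n_points, n_episodes)) < p_success[None, :, None]

# One curve per run: every value is a multiple of 1/8, so it looks discrete.
per_run = successes.mean(axis=2)

# Mean and standard deviation across the 8 runs: the mean is a multiple of 1/64,
# which gives a much finer-grained and smoother curve.
mean = per_run.mean(axis=0)
std = per_run.std(axis=0)

print("distinct values in a single run:", np.unique(per_run[0]))
print("distinct values in the averaged curve:", np.unique(mean).size)
```

Plotting mean with a band of plus or minus std gives the kind of figure described above, while any single row of per_run stays on the 1/8 grid.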

wjyustl commented 6 months ago

Thank you very much. I will try again in a moment to get the right results.