JongseongChae / RIME

Implementation of Robust Imitation Learning against Variations in Environment Dynamics
MIT License

How to run Gail algorithm? #1

Open YanbinLin94 opened 1 year ago

YanbinLin94 commented 1 year ago

Hi JongseongChae, I am wondering how to run only the GAIL algorithm using your code. Could you give me an example of how to run GAIL?

JongseongChae commented 1 year ago

Hi, thank you for your interest in our work.

This repository doesn't contain code for running the original GAIL algorithm, which is trained on a single interaction (training) environment. I trained GAIL using another repository (pytorch-a2c-ppo-acktr) and evaluated it on our perturbed-dynamics setting. That repository is an implementation of GAIL with a gradient penalty, which is why I didn't upload the code. Please note that I have provided the GAIL performance in the result directory of our repository. I think you can easily implement GAIL on top of our code and validate your implementation by comparing its performance against our GAIL results.

Thank you!
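
For readers unfamiliar with the gradient penalty mentioned in the reply above: it pushes the norm of the discriminator's gradient with respect to its input toward 1, evaluated on points interpolated between expert and policy samples. The sketch below only illustrates how that term is computed. It assumes a toy one-unit discriminator `D(x) = v * tanh(w . x)` whose input gradient can be written by hand, where `x` stands for a concatenated state-action vector. The function names and the coefficient `10.0` are illustrative assumptions, not taken from either repository; a real implementation would obtain the gradient through automatic differentiation.

```python
import math
import random
from typing import List, Optional, Sequence


def disc_input_grad(w: Sequence[float], v: float, x: Sequence[float]) -> List[float]:
    """Gradient of the toy discriminator D(x) = v * tanh(w . x) w.r.t. x."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    scale = v * (1.0 - math.tanh(z) ** 2)  # derivative of v * tanh(z) w.r.t. z
    return [scale * wi for wi in w]


def gradient_penalty(
    w: Sequence[float],
    v: float,
    expert_batch: Sequence[Sequence[float]],
    policy_batch: Sequence[Sequence[float]],
    coeff: float = 10.0,  # illustrative weight, an assumption
    rng: Optional[random.Random] = None,
) -> float:
    """Mean of coeff * (||dD/dx|| - 1)^2 over expert/policy interpolates."""
    rng = rng or random.Random(0)
    penalties = []
    for xe, xp in zip(expert_batch, policy_batch):
        alpha = rng.random()  # random mixing weight in [0, 1)
        mixed = [alpha * e + (1.0 - alpha) * p for e, p in zip(xe, xp)]
        grad = disc_input_grad(w, v, mixed)
        norm = math.sqrt(sum(g * g for g in grad))
        penalties.append(coeff * (norm - 1.0) ** 2)
    return sum(penalties) / len(penalties)
```

In training, this term would be added to the discriminator's classification loss, so the penalty shrinks as the gradient norm on the interpolated points approaches 1.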

YanbinLin94 commented 1 year ago

Thank you for your reply. Yes, I noticed that you provided the GAIL performance in the result directory. The problem is that I want to use GAIL in the Humanoid environment, and it seems that the repository (pytorch-a2c-ppo-acktr, https://github.com/ikostrikov/pytorch-a2c-ppo-acktr-gail) doesn't include expert demonstrations for the Humanoid environment. Do you have any idea how to implement GAIL quickly in the Humanoid environment? I have run your RIME code in the Humanoid environment and would like to compare the performance. Thank you very much.

Best, Yanbin


JongseongChae commented 1 year ago

I'm sorry, I can't provide that. I also haven't run our algorithm on the Humanoid domain. Instead, I'd like to share some ideas based on my experience.

  1. Generate expert demonstrations using SAC or PPO with whichever repository suits your needs (pytorch-a2c-ppo-acktr, OpenAI Baselines, or a SAC implementation).
  2. Check your generated demonstrations. You should verify that they appear to come from a reasonable expert policy: for example, that each trajectory runs for the full horizon, or that the trajectories look similar when visualized. In my experience, even a stationary policy can generate strange trajectories in some states. This process may be difficult and time-consuming.
  3. Implement the algorithms trained on a single interaction environment; refer to our code and pytorch-a2c-ppo-acktr.

Thank you.
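
The check described in step 2 above can be partly automated. The following is a minimal sketch, assuming demonstrations are stored as one dict per episode with a `rewards` list; that format, the function name `check_demonstrations`, and the `max_return_cv` threshold are hypothetical and would need adapting to whatever the chosen repository actually saves. It covers only episode length and return consistency, so visual inspection of the trajectories is still needed.

```python
import statistics
from typing import Dict, List, Sequence

# Hypothetical trajectory format: one dict per episode holding per-step rewards.
Trajectory = Dict[str, List[float]]


def check_demonstrations(
    trajectories: Sequence[Trajectory],
    horizon: int,
    max_return_cv: float = 0.1,  # illustrative threshold, an assumption
) -> Dict[str, object]:
    """Flag episodes that ended early and report how much returns vary.

    horizon       -- episode length expected from a healthy expert.
    max_return_cv -- upper bound on the coefficient of variation
                     (standard deviation / |mean|) of episode returns.
    """
    if not trajectories:
        raise ValueError("no trajectories to check")
    lengths = [len(t["rewards"]) for t in trajectories]
    returns = [sum(t["rewards"]) for t in trajectories]

    # Episodes shorter than the horizon suggest early termination,
    # for example because the agent fell over.
    short = [i for i, n in enumerate(lengths) if n < horizon]

    mean_ret = statistics.fmean(returns)
    std_ret = statistics.pstdev(returns)
    cv = std_ret / abs(mean_ret) if mean_ret != 0 else float("inf")

    return {
        "short_episodes": short,
        "mean_return": mean_ret,
        "return_cv": cv,
        "looks_consistent": not short and cv <= max_return_cv,
    }


if __name__ == "__main__":
    demos = [
        {"rewards": [1.0] * 1000},
        {"rewards": [1.0] * 1000},
        {"rewards": [1.0] * 120},  # terminated early
    ]
    report = check_demonstrations(demos, horizon=1000)
    print(report["short_episodes"], report["looks_consistent"])
```

Any episode index listed under `short_episodes` is a candidate for removal or for retraining the expert before the data is used for imitation learning.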

YanbinLin94 commented 1 year ago

Thank you very much. I will give it a try. If I get results, I will let you know.

Best, Yanbin
