google-research / rl-reliability-metrics

The RL Reliability Metrics library provides a set of metrics for measuring the reliability of reinforcement learning (RL) algorithms, as well as statistical tools for comparing algorithms and for computing confidence intervals on these metrics.
Apache License 2.0

Issues with the instructions for running the mujoco example #3

SirBuster33 closed this issue 4 years ago

SirBuster33 commented 4 years ago

Hi there!

We're 5th-semester bachelor students and would like to use your performance metrics to analyze different versions of our DQN implementation for the LunarLander-v2 Gym environment. Unfortunately, we got stuck trying to get the MuJoCo example to work, right at the beginning (somewhere around Steps 0-3; we're quite confused about where exactly we are).

Would it be possible for you to guide us through this example? :-)

-- Philipp

scychan commented 4 years ago

Hi Philipp,

Great to hear that you are using the metrics! Could you tell me a bit more about where and how you are stuck? Are you using TensorFlow or PyTorch, by any chance?

Best, Stephanie


SirBuster33 commented 4 years ago

Hi Stephanie,

To be more specific, we are currently using Keras. I think we did indeed get stuck at the very first step of your example (Step 0). What I am unsure about is how to use the code you provide and where to enter it. For example, when I tried to run the code in the Python command prompt, this happened (you can also see it in the attachment):

BASE_DIR="$HOME/rl_reliability_metrics/tf_agents_mujoco_expts" or tar -xvzf tf_agents_example_dataset.tgz

Neither command could be executed, because "BASE_DIR" and "tar" were not recognized. Is the code supposed to be executed this way, or am I doing it completely wrong?

Best, Philipp

[Attached screenshot: Python_Trying_Step_0]

scychan commented 4 years ago

Hi Philipp,

The commands in the example are all Bash/shell commands (for Unix), but it looks like you are running on a Windows machine. Unfortunately, we assumed that users would be on Unix, but you can look up the equivalent commands for Windows. For example, "tar" is just the command for extracting files from the compressed .tgz archive. Alternatively, I believe you can try running the example on a remote machine, e.g. on Google Cloud.

Best, Stephanie
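For readers on Windows, here is a minimal sketch of what the two shell commands quoted above do, written in Python so it runs on any platform. It is not part of the library or its instructions. The function name `extract_archive` is hypothetical, and the sketch assumes that the archive `tf_agents_example_dataset.tgz` has already been downloaded into the base directory, which the thread does not confirm.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Rough equivalent of running `tar -xvzf <archive>` inside dest_dir.

    Hypothetical helper for illustration; returns the archive member names.
    """
    dest_dir = Path(dest_dir)
    # Create the destination directory (and any parents) if it is missing.
    dest_dir.mkdir(parents=True, exist_ok=True)
    # "r:gz" opens a gzip-compressed tar archive, which is what .tgz is.
    with tarfile.open(archive_path, "r:gz") as archive:
        names = archive.getnames()
        # Only extract archives from a source you trust.
        archive.extractall(path=dest_dir)
    return names


if __name__ == "__main__":
    # Mirrors BASE_DIR="$HOME/rl_reliability_metrics/tf_agents_mujoco_expts"
    base_dir = Path.home() / "rl_reliability_metrics" / "tf_agents_mujoco_expts"
    # Assumption: the archive sits directly inside base_dir.
    archive = base_dir / "tf_agents_example_dataset.tgz"
    if archive.exists():
        for name in extract_archive(archive, base_dir):
            print(name)  # like the -v flag, list each extracted member
    else:
        print(f"Archive not found: {archive}")
```

Save this as a .py file and run it with `python`, rather than typing it line by line; the original `BASE_DIR=...` and `tar ...` lines are shell syntax, which is why the Python prompt rejected them.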

SirBuster33 commented 4 years ago

Hi Stephanie,

Alright, I'll try to see whether we can make it work with Google Cloud instead then. Thank you very much for the support, I hope I did not bother you too much with my trivial questions (since I am still pretty new to CS as a topic) :-P

Best, Philipp

scychan commented 4 years ago

Good luck! We all started at that point, at some time :)
