instadeepai / og-marl

Datasets with baselines for offline multi-agent reinforcement learning.
https://instadeepai.github.io/og-marl/
Apache License 2.0

Can't get dataset `omar` #52

Closed: ZijunSong closed this issue 3 weeks ago

ZijunSong commented 1 month ago

Hello, when I try to get the omar dataset from your Google Drive (https://drive.google.com/drive/folders/11Crh-zuqTWGR0Rf0BvS-OGfmT6vQlYrW), I notice that the omar folder is empty. After downloading the dataset from the original author, I found that the structure of the omar dataset differs from that of datasets like smac, which leads to errors in the code. I hope to receive your help! Thank you!

jcformanek commented 1 month ago

Hi, if you are using the baselines branch, then the dataset URLs are no longer valid. Please use the URLs on the main branch or download the relevant dataset directly from Hugging Face: https://huggingface.co/datasets/InstaDeepAI/og-marl/tree/main/prior_work/omar/mamujoco

ZijunSong commented 4 weeks ago

Thank you! I have downloaded your datasets. I noticed that some of the table entries in your paper for the MPE environment are marked as 'Not available'. Could you share the reason for this? Additionally, I see that the table reports the mean and standard deviation as the evaluation metric, but the provided code seems to output win rate and episode return. Did I overlook any part of your code? Or is episode return sufficient as the evaluation metric?

[Image: WeChat screenshot 20241104091750]

jcformanek commented 4 weeks ago

Hi, we struggled to get the environment working on the PP and WD scenarios because they are actually competitive scenarios and the authors seem to use a pre-trained Torch model to control the adversaries. In the end we gave up trying to run those scenarios. However, we did convert the datasets. If you want to try getting those scenarios to work, we would appreciate you sharing your solution with us.

The OMAR authors seem to normalise their raw episode return values by dividing them by the mean of the dataset. This is what we gather from their code. We similarly normalised our raw episode returns before adding them to the table.

You can get the mean of the datasets using our vault analysing utilities which are available on the main branch.
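The normalisation described above can be sketched as follows. This is a minimal illustration, not the OMAR or og-marl code: it assumes "the mean of the dataset" means the mean episode return over the episodes stored in the dataset, and the function name and all numbers are hypothetical.

```python
from statistics import mean, stdev


def normalise_returns(raw_returns, dataset_mean_return):
    """Divide each raw evaluation return by the dataset's mean episode return."""
    if dataset_mean_return == 0:
        raise ValueError("dataset mean return is zero; cannot normalise")
    return [r / dataset_mean_return for r in raw_returns]


# Hypothetical numbers, for illustration only.
dataset_episode_returns = [100.0, 200.0, 300.0]  # returns of episodes in the dataset
eval_returns = [150.0, 250.0]                    # raw returns of a trained policy

dataset_mean = mean(dataset_episode_returns)
normalised = normalise_returns(eval_returns, dataset_mean)

# The mean and standard deviation of the normalised returns are the kind of
# summary a results table would report.
print(mean(normalised), stdev(normalised))
```

Under this reading, the episode return printed by the training code is the raw quantity, and the table entry is obtained from it by one division.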

ZijunSong commented 3 weeks ago

Thank you very much for your response! I’ll look into ways to address this issue within the OMAR environment and would be glad to share my findings with you once I have a solution. I’ll also continue testing in other environments. I noticed that for the MAMuJoCo environment, only the 2halfcheetah data seems to be available at the Hugging Face link. Would it be possible to access the data listed in Table 12?

[Image: WeChat screenshot 20241105093645]

Thank you for your help!

ZijunSong commented 3 weeks ago

When using the 2halfcheetah dataset, I ran into an error:

Traceback (most recent call last):
  File "/personal/og-marl-baselines-code/main.py", line 64, in <module>
    app.run(main)
  File "/opt/mamba/envs/baselines210/lib/python3.9/site-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/opt/mamba/envs/baselines210/lib/python3.9/site-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "/personal/og-marl-baselines-code/main.py", line 56, in main
    system.train_offline(
  File "/personal/og-marl-baselines-code/systems/base.py", line 98, in train_offline
    train_logs = self.train_step(experience)
  File "/personal/og-marl-baselines-code/systems/iddpg.py", line 201, in train_step
    logs = self._tf_train_step(experience)
  File "/opt/mamba/envs/baselines210/lib/python3.9/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/tmp/__autograph_generated_filedan3mwyy.py", line 42, in tf___tf_train_step
    target_qs_1 = ag__.converted_call(ag__.ld(self)._target_critic_network_1, (ag__.ld(env_states), ag__.ld(target_actions)), None, fscope)
  File "/tmp/__autograph_generated_filepw71n4kn.py", line 15, in tf___decorate_unbound_method
    retval_ = ag__.converted_call(ag__.ld(decorator_fn), (ag__.ld(bound_method), ag__.ld(self), ag__.ld(args), ag__.ld(kwargs)), None, fscope)
  File "/tmp/__autograph_generated_filewajbqxf8.py", line 47, in tf__wrap_with_name_scope
    retval_ = ag__.converted_call(ag__.ld(method), tuple(ag__.ld(args)), dict(**ag__.ld(kwargs)), fscope)
  File "/tmp/__autograph_generated_fileu60ng51a.py", line 47, in tf____call__
    critic_input = ag__.converted_call(ag__.ld(tf).concat, ([ag__.ld(states), ag__.ld(agent_actions)],), dict(axis=-1), fscope)
ValueError: in user code:

    File "/personal/og-marl-baselines-code/systems/iddpg_bc.py", line 91, in _tf_train_step  *
        target_qs_1 = self._target_critic_network_1(env_states, target_actions)
    File "/opt/mamba/envs/baselines210/lib/python3.9/site-packages/sonnet/src/utils.py", line 85, in _decorate_unbound_method  *
        return decorator_fn(bound_method, self, args, kwargs)
    File "/opt/mamba/envs/baselines210/lib/python3.9/site-packages/sonnet/src/base.py", line 262, in wrap_with_name_scope  *
        return method(*args, **kwargs)
    File "/personal/og-marl-baselines-code/systems/iddpg.py", line 70, in __call__  *
        critic_input = tf.concat([states, agent_actions], axis=-1)

    ValueError: Shape must be rank 5 but is rank 4 for '{{node state_and_action_critic/concat}} = ConcatV2[N=2, T=DT_FLOAT, Tidx=DT_INT32](state_and_action_critic/stack, Reshape_2, state_and_action_critic/concat/axis)' with input shapes: [20,64,2,2,17], [20,64,2,3], [].

I'm unsure whether it is due to my own actions or if it is a problem with the code itself. If it turns out to be my mistake, I will continue debugging on my end. Thank you very much for your assistance!

jcformanek commented 3 weeks ago

Hi, the problem was that we replicated the states for each agent in that dataset, which is not what the code expects. That's why there is an extra dimension that is causing the problem. I have fixed the dataset and uploaded it to Hugging Face. Please just redownload the latest version of the dataset.
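The shape bookkeeping behind that error can be reproduced with NumPy. This is only a sketch: the sizes are taken from the error message, but `critic_input` is a hypothetical stand-in for the critic, and stacking the state along axis 2 is an assumption inferred from the `state_and_action_critic/stack` node name, not the repository's actual code.

```python
import numpy as np

# Sizes read off the error message: time, batch, agents, state and action dims.
T, B, N = 20, 64, 2
STATE_DIM, ACT_DIM = 17, 3


def critic_input(states, actions):
    """Hypothetical critic input: copy the state once per agent, then concatenate."""
    per_agent_states = np.stack([states] * N, axis=2)  # assumed stacking axis
    return np.concatenate([per_agent_states, actions], axis=-1)


actions = np.zeros((T, B, N, ACT_DIM), dtype=np.float32)

# Expected layout: one global state per timestep, shape [T, B, STATE_DIM].
good_states = np.zeros((T, B, STATE_DIM), dtype=np.float32)
print(critic_input(good_states, actions).shape)

# Layout with states already replicated per agent: [T, B, N, STATE_DIM].
# Stacking adds a second agent axis, so the ranks no longer match.
bad_states = np.zeros((T, B, N, STATE_DIM), dtype=np.float32)
try:
    critic_input(bad_states, actions)
except ValueError as err:
    print("rank mismatch:", err)

# Local workaround if the copies are identical: keep one agent's copy.
deduplicated = bad_states[:, :, 0]
print(critic_input(deduplicated, actions).shape)
```

Redownloading the corrected dataset, as suggested above, removes the need for the slicing workaround.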

ZijunSong commented 3 weeks ago

Thank you very much for your hard work and dedication. I apologize for the inconvenience, but I’ve encountered another error:

Traceback (most recent call last):
  File "/personal/og-marl-baselines-code/main.py", line 64, in <module>
    app.run(main)
  File "/opt/mamba/envs/baselines210/lib/python3.9/site-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/opt/mamba/envs/baselines210/lib/python3.9/site-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "/personal/og-marl-baselines-code/main.py", line 56, in main
    system.train_offline(
  File "/personal/og-marl-baselines-code/systems/base.py", line 98, in train_offline
    train_logs = self.train_step(experience)
  File "/personal/og-marl-baselines-code/systems/iddpg.py", line 201, in train_step
    logs = self._tf_train_step(experience)
  File "/opt/mamba/envs/baselines210/lib/python3.9/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/tmp/__autograph_generated_filelzte_b39.py", line 51, in tf___tf_train_step
    online_actions = ag__.converted_call(ag__.ld(unroll_rnn), (ag__.ld(self)._policy_network, ag__.converted_call(ag__.ld(merge_batch_and_agent_dim_of_time_major_sequence), (ag__.ld(observations),), None, fscope), ag__.converted_call(ag__.ld(merge_batch_and_agent_dim_of_time_major_sequence), (ag__.ld(resets),), None, fscope)), None, fscope)
  File "/tmp/__autograph_generated_file3bwc7ddy.py", line 29, in tf__unroll_rnn
    ag__.for_stmt(ag__.converted_call(ag__.ld(range), (ag__.ld(T),), None, fscope), None, loop_body, get_state, set_state, ('hidden_state',), {'iterate_names': 'i'})
  File "/tmp/__autograph_generated_file3bwc7ddy.py", line 24, in loop_body
    (output, hidden_state) = ag__.converted_call(ag__.ld(rnn_network), (ag__.ld(inputs)[ag__.ld(i)], ag__.ld(hidden_state)), None, fscope)
  File "/tmp/__autograph_generated_file3jnehif4.py", line 15, in tf___decorate_unbound_method
    retval_ = ag__.converted_call(ag__.ld(decorator_fn), (ag__.ld(bound_method), ag__.ld(self), ag__.ld(args), ag__.ld(kwargs)), None, fscope)
  File "/tmp/__autograph_generated_file_gt0mqtw.py", line 47, in tf__wrap_with_name_scope
    retval_ = ag__.converted_call(ag__.ld(method), tuple(ag__.ld(args)), dict(**ag__.ld(kwargs)), fscope)
  File "/tmp/__autograph_generated_file7l5jft5v.py", line 78, in tf____call__
    ag__.for_stmt(ag__.converted_call(ag__.ld(enumerate), (ag__.ld(self)._layers,), None, fscope), None, loop_body, get_state_3, set_state_3, ('current_inputs', 'recurrent_idx'), {'iterate_names': '(idx, layer)'})
  File "/tmp/__autograph_generated_file7l5jft5v.py", line 61, in loop_body
    ag__.if_stmt(ag__.converted_call(ag__.ld(isinstance), (ag__.ld(layer), ag__.ld(RNNCore)), None, fscope), if_body_1, else_body_1, get_state_1, set_state_1, ('current_inputs', 'recurrent_idx'), 2)
  File "/tmp/__autograph_generated_file7l5jft5v.py", line 60, in else_body_1
    current_inputs = ag__.converted_call(ag__.ld(layer), (ag__.ld(current_inputs),), None, fscope)
  File "/tmp/__autograph_generated_file3jnehif4.py", line 15, in tf___decorate_unbound_method
    retval_ = ag__.converted_call(ag__.ld(decorator_fn), (ag__.ld(bound_method), ag__.ld(self), ag__.ld(args), ag__.ld(kwargs)), None, fscope)
  File "/tmp/__autograph_generated_file_gt0mqtw.py", line 47, in tf__wrap_with_name_scope
    retval_ = ag__.converted_call(ag__.ld(method), tuple(ag__.ld(args)), dict(**ag__.ld(kwargs)), fscope)
  File "/tmp/__autograph_generated_filemyqmo5la.py", line 11, in tf____call__
    outputs = ag__.converted_call(ag__.ld(tf).matmul, (ag__.ld(inputs), ag__.ld(self).w), None, fscope)
ValueError: in user code:

    File "/personal/og-marl-baselines-code/systems/iddpg_bc.py", line 118, in _tf_train_step  *
        online_actions = unroll_rnn(
    File "/personal/og-marl-baselines-code/utils/utils.py", line 104, in unroll_rnn  *
        output, hidden_state = rnn_network(inputs[i], hidden_state)  # type: ignore
    File "/opt/mamba/envs/baselines210/lib/python3.9/site-packages/sonnet/src/utils.py", line 85, in _decorate_unbound_method  *
        return decorator_fn(bound_method, self, args, kwargs)
    File "/opt/mamba/envs/baselines210/lib/python3.9/site-packages/sonnet/src/base.py", line 262, in wrap_with_name_scope  *
        return method(*args, **kwargs)
    File "/opt/mamba/envs/baselines210/lib/python3.9/site-packages/sonnet/src/recurrent.py", line 583, in __call__  *
        current_inputs = layer(current_inputs)
    File "/opt/mamba/envs/baselines210/lib/python3.9/site-packages/sonnet/src/utils.py", line 85, in _decorate_unbound_method  *
        return decorator_fn(bound_method, self, args, kwargs)
    File "/opt/mamba/envs/baselines210/lib/python3.9/site-packages/sonnet/src/base.py", line 262, in wrap_with_name_scope  *
        return method(*args, **kwargs)
    File "/opt/mamba/envs/baselines210/lib/python3.9/site-packages/sonnet/src/linear.py", line 85, in __call__  *
        outputs = tf.matmul(inputs, self.w)

    ValueError: Dimensions must be equal, but are 8 and 15 for '{{node linear/MatMul_52}} = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false](strided_slice_45, linear/MatMul_52/ReadVariableOp)' with input shapes: [128,8], [15,128].
jcformanek commented 3 weeks ago

Can you give me some information on how to reproduce this? What scenario and dataset are you trying to run?

ZijunSong commented 3 weeks ago

OK! Still the 2halfcheetah dataset:

[Image: WeChat screenshot 20241106140530]

I get the same error when the system is maddpg+cql.

jcformanek commented 3 weeks ago

Do you want to run the 2halfcheetah dataset from omar or from og-marl? I ask because on the baselines-code branch you need to set FLAGS.env=mamujoco_omar and make sure you are using MuJoCo version 200 for omar.

I also realised that there are some environment files missing on this branch, which I have now pushed.

But other than that, I am struggling to reproduce this problem. It seems to be working on my side. Is there any other information you could share with me that might help diagnose the problem?
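A mismatch like the `[128,8]` versus `[15,128]` one in the traceback above can be caught before training with a small pre-flight check. This is a hedged sketch: `check_feature_dim` is a hypothetical helper, not part of og-marl, and it only assumes that a linear layer's weight has shape `[input_features, output_features]`, as the error message indicates.

```python
def check_feature_dim(input_shape, weight_shape):
    """Raise early if the inputs' last axis differs from the layer's input size."""
    got, expected = input_shape[-1], weight_shape[0]
    if got != expected:
        raise ValueError(
            f"inputs have {got} features but the layer was built for {expected}; "
            "check that the environment setting and the dataset come from the same source"
        )


# Shapes taken from the error message in this thread.
try:
    check_feature_dim((128, 8), (15, 128))
except ValueError as err:
    print(err)

# Matching shapes pass silently.
check_feature_dim((128, 15), (15, 128))
```

Running a check of this kind on one batch from the dataset makes it obvious when the observation size and the network's input size disagree, which is what a mismatched environment setting would produce.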

ZijunSong commented 3 weeks ago

Thank you very much for your patient support. I’ve successfully clarified the relationship between the environment and the data, and the experiment is now running smoothly. I truly appreciate your outstanding work and the time you've taken to assist me. Thank you once again!

jcformanek commented 3 weeks ago

Oh that is great! Very happy you got it working. Please feel free to ask any further questions. I am happy to help.

By the way, can you at some point switch to using the main branch? It's very helpful to get user feedback on the code. I would like to stop maintaining this branch in favour of the main branch.