DavidKoleczek / human_marl

Cooperative Multi Agent Reinforcement Learning with Human in the Loop

RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x14 and 8x64) #7

Closed wanran-cell closed 1 year ago

wanran-cell commented 1 year ago

When I run alphaVSintervention_swarm.py, I get this error: RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x14 and 8x64). I hope you can help.

WeihaoTan commented 1 year ago

Hi. The observation of the original Lunar Lander has 8 dims. Because we append the human action (6 dims) to the observation, our observation has 14 dims. Note that we first train a simulated agent to imitate real human actions using the original env, whose obs has 8 dims. We then train an assistive agent (copilot) to help it, and the obs of this assistive agent should have 14 dims. Functions whose names contain "co", such as make_co_env and co_DDQN_agent, are used to train the copilot, so their obs has 14 dims; otherwise, the obs has 8 dims. It seems that the functions you use to create the agent and the env are mismatched.
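The dimension arithmetic described above can be sketched in plain Python. This is only an illustration under assumptions: the 6-dim human action is taken to be a one-hot encoding, and one_hot, make_copilot_obs, and can_multiply are hypothetical helpers, not functions from the repository.

```python
PILOT_OBS_DIM = 8      # original Lunar Lander observation size (from the thread)
HUMAN_ACTION_DIM = 6   # size of the appended human action (from the thread)


def one_hot(index, size):
    """Return a list of `size` zeros with 1.0 at `index` (assumed encoding)."""
    if not 0 <= index < size:
        raise ValueError(f"index {index} out of range for size {size}")
    return [1.0 if i == index else 0.0 for i in range(size)]


def make_copilot_obs(env_obs, human_action):
    """Append the encoded human action to the raw environment observation."""
    return list(env_obs) + one_hot(human_action, HUMAN_ACTION_DIM)


def can_multiply(input_shape, weight_shape):
    """A (1 x n) input can multiply an (m x k) weight only when n == m."""
    return input_shape[1] == weight_shape[0]


raw_obs = [0.0] * PILOT_OBS_DIM
copilot_obs = make_copilot_obs(raw_obs, human_action=2)

print(len(raw_obs), len(copilot_obs))
# A network whose first layer expects 8 inputs cannot take a 14-dim obs,
# which is the situation the (1x14 and 8x64) message describes.
print(can_multiply((1, len(copilot_obs)), (PILOT_OBS_DIM, 64)))
print(can_multiply((1, len(raw_obs)), (PILOT_OBS_DIM, 64)))
```

Under these assumptions, an 8-dim pilot network fed a copilot observation fails the shape check, while the same network fed the raw observation passes it.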

wanran-cell commented 1 year ago

Thank you very much for your patient and detailed explanation, which gave me a deeper understanding of the paper. I am now training a simulated agent. I am very grateful for your help.

wanran-cell commented 1 year ago

Hi, when I run lander_sac_bounds.py, this error occurs:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Your work is awesome and admirable. Could you explain this error and the order in which the scripts in the experiments folder should be run? I hope you can help. Thank you very much!

WeihaoTan commented 1 year ago

@DavidKoleczek can you help him?

wanran-cell commented 1 year ago

Hi, I would like to ask about another problem. 😵😵 First, I ran full_pilot_lunarlander.py to train the agent. Next, I tried to run co_pilot_lunarlander.py to train the copilot, but it failed with a FileNotFoundError:

Traceback (most recent call last):
  File "E:/ShareControl/human_marl-main/human_marl-main/experiments/co_pilot_lunarlander.py", line 150, in <module>
    checkpoint = torch.load(PATH)
  File "D:\ProgramData\Anaconda\envs\OIS\lib\site-packages\torch\serialization.py", line 579, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "D:\ProgramData\Anaconda\envs\OIS\lib\site-packages\torch\serialization.py", line 230, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "D:\ProgramData\Anaconda\envs\OIS\lib\site-packages\torch\serialization.py", line 211, in __init__
    super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: './saved_models/intervention_penalty/noisy_pilot_alpha_0.2.pkl'

I think the copilot model may not have been saved. I tried modifying lunar_lander_environment.py, but that did not work. Then I tried to run alphaVSintervention_swarm.py, and the previous error is still there: RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x14 and 8x64). penaltyVSintervention_swarm.py hits the same mat1 and mat2 shapes error. I have not solved the lander_sac_bounds.py error mentioned above either.

In addition, in rewardVSbudget_swarm.py at line 87, changing "../.." to "../." lets it save the .pkl file correctly. The file path is ./saved_models/. I enjoyed running the other .py scripts, which worked well. Your work is very cool.👍👍 Looking forward to your help and advice. Thank you for your time.🙏🙏

WeihaoTan commented 1 year ago

Hi, thanks for your interest in our work, and sorry for the delayed response. @DavidKoleczek, who is responsible for the SAC code, seems to be unavailable. I will try to help you with the rest of the issues.

co_pilot_lunarlander.py also contains the code to train the pilot, so you may not need full_pilot_lunarlander.py to train an independent pilot. To train a copilot, you first need a pilot. To train your own pilot, change line 25 from load_pretrained_full_pilot = True to False, and make sure the path is correct. A relative path may fail on different systems, so try using an absolute path. Once you have your own pilot, you can use it to train or load a copilot. To train one, change line 80 from load_pretrained_co_pilot = True to False. You can therefore set both line 25 and line 80 to False in co_pilot_lunarlander.py at the same time. The script will then train the pilot, use it to train the copilot automatically, and save both models to the corresponding folders.
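The absolute-path advice can be sketched with the standard library. This is a minimal sketch, not code from the repository: resolve_model_path and checkpoint_exists are hypothetical helper names, and the file name is the one from the traceback earlier in this thread.

```python
from pathlib import Path


def resolve_model_path(relative_path, base_dir=None):
    """Turn a relative model path into an absolute one.

    base_dir defaults to the current working directory. In a script,
    Path(__file__).resolve().parent is a common choice for base_dir,
    because it does not depend on where the script is launched from.
    """
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    return (base / relative_path).resolve()


def checkpoint_exists(path):
    """Report whether a checkpoint file is present before trying to load it."""
    return Path(path).is_file()


model_path = resolve_model_path(
    "saved_models/intervention_penalty/noisy_pilot_alpha_0.2.pkl"
)
if checkpoint_exists(model_path):
    print(f"found checkpoint: {model_path}")
else:
    print(f"missing checkpoint, train the model first: {model_path}")
```

Checking for the file before loading turns a deep FileNotFoundError into a message that says which model still needs to be trained.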

As for the size mismatch, it seems you are still mixing up the pilot and the copilot. Check that you load the pilot with the pilot function and the copilot with the copilot function, and that you use the corresponding function to handle each model.
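One way to catch such a mix-up early is to compare the observation size the env produces with the input size the agent was built for, before training starts. The sketch below is hypothetical: expected_obs_dim and check_pairing are illustrative names, not part of the repository, and the sizes 8 and 14 are taken from the explanation earlier in this thread.

```python
# Observation sizes per role, as stated earlier in the thread.
OBS_DIMS = {"pilot": 8, "copilot": 14}


def expected_obs_dim(role):
    """Look up the observation size for a role, rejecting unknown roles."""
    if role not in OBS_DIMS:
        raise KeyError(f"unknown role: {role!r}")
    return OBS_DIMS[role]


def check_pairing(env_role, agent_role):
    """Raise a readable error when env and agent are built for different roles."""
    env_dim = expected_obs_dim(env_role)
    agent_dim = expected_obs_dim(agent_role)
    if env_dim != agent_dim:
        raise ValueError(
            f"{env_role} env emits {env_dim}-dim observations, "
            f"but the {agent_role} agent expects {agent_dim}-dim input"
        )
    return env_dim


print(check_pairing("copilot", "copilot"))

try:
    check_pairing("copilot", "pilot")
except ValueError as err:
    print(f"mismatch detected: {err}")
```

A guard like this fails with a message naming both roles, instead of a matrix-shape error raised deep inside the network's forward pass.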

wanran-cell commented 1 year ago

Hi, no worries about the delay. Thanks for your hard work and your patient, careful response. For a new researcher in shared autonomy, you are my hero. I changed both line 25 and line 80 at the same time, but the mat1 and mat2 mismatch is still there. I tried to debug it, and the call exp_co_pilot.train(frames=frames) at line 120 of co_pilot_lunarlander.py seems to be where it fails. When I changed line 120 from exp_co_pilot.train(frames=frames) to exp_co_pilot.intervention_train(frames=frames), the same issue occurred. I don't know how to thank you enough. Thanks for everything.🌈🌈

WeihaoTan commented 1 year ago

Weird. I just cloned the code and ran it from scratch, and it works perfectly. Did you change anything? I am not sure whether it will help, but you could try pip uninstall autonomous-learning-library followed by pip install autonomous-learning-library==0.6.2.

wanran-cell commented 1 year ago

So strange. You are a kind person and rigorous in your research. Thanks for your patience.🙏🙏🤞 I have no GPU, so I changed device="cuda" to device="cpu" to run on the CPU. Nothing else was changed. I reinstalled autonomous-learning-library==0.6.2 and also created a fresh environment, but neither worked, and the issue remains. In co_pilot_lunarlander.py, I can train the pilot perfectly. After pilot_model.pkl is saved, copilot training does not seem to start. It is halted by the error, which is raised at exp_co_pilot.train(frames=frames). The error suggests that the concatenated agent state does not match the previous weights (1×14 vs. 8×64). The same thing happens in alphaVSintervention_swarm.py for me. Is my torch installation wrong? I have torchaudio==0.8.1 and torchvision==0.9.1, CPU version. Could you help? Muchas gracias! I am sincerely grateful to you all!

WeihaoTan commented 1 year ago

Hi, I tried to change some settings. You can git pull and see whether it solves your issues. Hope it works.

wanran-cell commented 1 year ago

You are my hero! Wonderful! The mismatch is completely solved.✨✨ The copilot works perfectly. I deeply appreciate all the help. Thank you, and best wishes to you!!!