IDSIA / hhmarl_2D

Heterogeneous Hierarchical Multi Agent Reinforcement Learning for Air Combat

FileExistsError: [WinError 183] Cannot create a file when that file already exists: 'policies/model.pt' -> 'policies/Esc_AC1.pt' #4

Closed. YangRongtai closed this issue 6 months ago.

YangRongtai commented 6 months ago

Why did this error occur while training the low-level policy?

Traceback (most recent call last):
  File "D:\Py_project\hhmarl_2D-main\hhmarl_2D-main\train_hetero.py", line 292, in
    make_checkpoint(args, algo, log_dir, i, args.level, test_env)
  File "D:\Py_project\hhmarl_2D-main\hhmarl_2D-main\train_hetero.py", line 105, in make_checkpoint
    os.rename('policies/model.pt', f'policies/{policy_name}.pt')
FileExistsError: [WinError 183] Cannot create a file when that file already exists: 'policies/model.pt' -> 'policies/Esc_AC1.pt'

ardian-selmonaj commented 6 months ago

It looks like a Windows-specific error; I never encountered this problem. However, have a look at policy_export.py, which I uploaded. You can comment out the section in make_checkpoint where you get the error and manually export the policies after training has finished.

YangRongtai commented 6 months ago

OK, I added a line before that line of code and the error is gone, like this:

    os.remove(f'policies/{policy_name}.pt')  # delete the existing target file
    os.rename('policies/model.pt', f'policies/{policy_name}.pt')

But there should be no problem with model storage, right?
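
(For context: on Windows, os.rename raises FileExistsError when the destination already exists, whereas on POSIX systems it silently replaces it, which is why this only shows up on Windows. The os.remove workaround above removes the error, but will itself raise FileNotFoundError the very first time the target policy file does not exist yet. A minimal sketch of an alternative using os.replace, which overwrites the destination on both platforms, assuming the same policy_name variable as in make_checkpoint:)

    import os

    # os.replace overwrites an existing destination on Windows and POSIX alike,
    # and succeeds even when the destination file does not exist yet.
    os.replace('policies/model.pt', f'policies/{policy_name}.pt')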

In addition, I want to compare algorithms other than PPO on the basis of your code. I have tried a little, but it has not succeeded so far. What should I pay attention to when replacing the algorithm in your code? Please give some guidance and suggestions; I look forward to your reply.

ardian-selmonaj commented 6 months ago

From level=4 onwards, you will need the learned policies in the policies folder. So, if they are not stored during training (because you commented that section out), you will need to call policy_export.py after training to obtain these policies. Just read the description of the project.
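
A minimal, hypothetical sketch of what such a post-training export could look like via Ray RLlib's Python API; the checkpoint path and policy IDs below are placeholders, and policy_export.py in the repository is the authoritative version (including whether full modules or only state_dicts are expected by the later training levels):

    # Hypothetical sketch of a manual policy export after training, for the case
    # where the export step in make_checkpoint has been commented out.
    import os
    import torch
    from ray.rllib.algorithms.algorithm import Algorithm

    checkpoint_path = "results/level3/checkpoint_000500"   # placeholder path
    policy_ids = ["ac1_policy", "ac2_policy"]              # placeholder policy IDs

    algo = Algorithm.from_checkpoint(checkpoint_path)
    os.makedirs("policies", exist_ok=True)
    for pid in policy_ids:
        policy = algo.get_policy(pid)
        # With the PyTorch framework, policy.model is a torch.nn.Module.
        torch.save(policy.model.state_dict(), os.path.join("policies", f"{pid}.pt"))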

Regarding algorithms other than PPO, you have to look at Ray RLlib for clear advice on how the different algorithms behave. I have to admit it is not straightforward and needs some adjustments depending on the algorithm. For example, DQN supports multi-agent training, but not the RNN and attention components used in my model. That is why Ray might be a bit restrictive.
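
As an illustration of the kind of adjustment involved, here is a hypothetical sketch of swapping PPO for DQN using RLlib's 2.x builder API. MultiAgentCartPole stands in for the combat environment of this repository, the policy IDs are placeholders, and exact method names may differ across Ray versions:

    # Hypothetical sketch of configuring a different RLlib algorithm (DQN) for a
    # multi-agent setup; not the configuration used in this repository.
    from ray.rllib.algorithms.dqn import DQNConfig
    from ray.rllib.env.multi_agent_env import make_multi_agent

    # Toy multi-agent environment standing in for the air-combat environment.
    MultiAgentCartPole = make_multi_agent("CartPole-v1")

    config = (
        DQNConfig()
        .environment(env=MultiAgentCartPole, env_config={"num_agents": 2})
        .framework("torch")
        .multi_agent(
            policies={"ac1_policy", "ac2_policy"},
            policy_mapping_fn=lambda agent_id, *args, **kwargs: (
                "ac1_policy" if agent_id == 0 else "ac2_policy"
            ),
        )
        # Unlike PPO, DQN cannot use recurrent or attention models, so options
        # like "use_lstm"/"use_attention" (or this project's custom attention
        # model) have to be dropped when switching.
        .training(model={"fcnet_hiddens": [256, 256]})
    )

    algo = config.build()
    result = algo.train()   # one training iteration
    print(result["training_iteration"])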