LucasAlegre / morl-baselines

Multi-Objective Reinforcement Learning algorithms implementations.
https://lucasalegre.github.io/morl-baselines
MIT License

Does anyone know how to load the trained model? #78

Closed · man469 closed this 9 months ago

man469 commented 9 months ago

I followed the instructions on this page: https://mo-gymnasium.farama.org/introduction/api/ to run the code, but I got the following error: `Envelope.act() missing 1 required positional argument: 'w'`. Thank you very much if anyone could help!

ffelten commented 9 months ago

Hi @man469,

Envelope (like GPI-LS or CAPQL) requires a weight vector in its act function. This is because these algorithms encode multiple policies inside a single network. You can see examples in the code (around the eval functions): https://github.com/LucasAlegre/morl-baselines/blob/main/morl_baselines/multi_policy/envelope/envelope.py#L378
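
For instance, a minimal sketch (assuming MO-Gymnasium's `deep-sea-treasure-v0` and a freshly constructed agent; exact method signatures can vary between versions, so double-check against the linked eval function):

```python
import numpy as np
import mo_gymnasium as mo_gym
from morl_baselines.multi_policy.envelope.envelope import Envelope

env = mo_gym.make("deep-sea-treasure-v0")  # 2 objectives: treasure value, time penalty
agent = Envelope(env)  # untrained here; train or load your own agent instead

obs, _ = env.reset()
# w is the missing positional argument from the error: a preference
# vector over the objectives, one entry per objective.
w = np.array([0.7, 0.3], dtype=np.float32)
action = agent.eval(obs, w)  # greedy action for this preference
```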

It depends on what you want to do with it, but the usual workflow is to:

  1. Train multiple policies (train the envelope agent)
  2. Look at the Pareto front or the resulting policies
  3. Decide which weight vector suits you (i.e., your preferences among the objectives)
  4. Use this weight vector as input to the eval function (see the sketch below)
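
Putting those four steps together, a rough sketch (the `train` arguments shown here, `total_timesteps`, `eval_env`, and `ref_point`, are assumptions based on the repo's examples; verify against the actual API):

```python
import numpy as np
import mo_gymnasium as mo_gym
from morl_baselines.multi_policy.envelope.envelope import Envelope

env = mo_gym.make("deep-sea-treasure-v0")
eval_env = mo_gym.make("deep-sea-treasure-v0")
agent = Envelope(env)

# 1. Train: the envelope network conditions on weights, so a single
#    training run covers many policies. (Arguments here are assumptions.)
agent.train(total_timesteps=100_000, eval_env=eval_env, ref_point=np.array([0.0, -50.0]))

# 2.-3. Inspect the logged Pareto front, then express the trade-off you
#       prefer as a weight vector over the objectives.
w = np.array([0.9, 0.1], dtype=np.float32)  # e.g. strongly favor treasure

# 4. Roll out the policy conditioned on that weight.
obs, _ = eval_env.reset()
done = False
while not done:
    action = agent.eval(obs, w)
    obs, vec_reward, terminated, truncated, _ = eval_env.step(action)
    done = terminated or truncated
```
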
man469 commented 9 months ago

Thank you very much for your quick reply!  


man469 commented 9 months ago

Thank you very much for your quick and kind reply!