chz056 / BEAR


Ways of customizing reward function #2

Open lijiayi9712 opened 1 year ago

lijiayi9712 commented 1 year ago

Hi, thanks for developing this framework. I am curious whether there is a tutorial for customizing the reward function. From the paper, it seems that the trade-off parameter beta can be easily tuned, but is there a way to design reward functions with a totally different format? Thanks!

chz056 commented 1 year ago

Hi Jiayi,

Thank you very much for your email, and sorry for the inconvenience. With the current version, the only way to change the reward function is to tune the trade-off parameter beta. To design a reward function with a different format, you need to edit line 133 in "build_env.py". We plan to update the framework soon, and a more customizable reward function module is on our list for the next update.

Sincerely, Chi
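
For readers following along, here is a minimal sketch of what a beta-weighted trade-off reward could look like. It is an illustration only: the function name `weighted_reward`, its arguments, and the exact formula (L1 norm of the action as an energy proxy, L2 norm of the temperature error as a comfort proxy) are assumptions, not the code at line 133 of "build_env.py".

```python
import numpy as np


def weighted_reward(zone_temps, target_temps, action, beta=0.5):
    """Hypothetical beta-weighted reward (NOT BEAR's actual formula).

    beta close to 1 puts the weight on energy use;
    beta close to 0 puts the weight on comfort (temperature tracking).
    """
    zone_temps = np.asarray(zone_temps, dtype=float)
    target_temps = np.asarray(target_temps, dtype=float)
    action = np.asarray(action, dtype=float)

    # Comfort proxy: Euclidean distance between zone and target temperatures.
    comfort_error = np.linalg.norm(zone_temps - target_temps)
    # Energy proxy: total absolute control effort.
    energy_use = np.linalg.norm(action, ord=1)

    # Rewards are negative costs, so lower error and effort score higher.
    return -(beta * energy_use + (1.0 - beta) * comfort_error)
```

Under these assumptions, sweeping `beta` from 0 to 1 moves the objective from pure temperature tracking to pure energy saving, which is the only knob the reply above describes for the current version.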


lijiayi9712 commented 1 year ago

Thanks for the info! Do you have an estimated time for this update?

chz056 commented 1 year ago

It should be around mid-July.

Best, Chi


chennnnnyize commented 7 months ago

Hi @chz056 and @lijiayi9712, shall we close this issue? It looks like my_custom_reward_function() in the customize folder addresses it.
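
As a rough illustration of the kind of "totally different format" asked about at the top of the thread, the sketch below passes a reward callable into a toy environment. Everything in it is hypothetical: `comfort_band_reward`, `ToyEnv`, and their signatures are made up for this example and do not reflect the signature of my_custom_reward_function() or how BEAR wires it in.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical reward signature: (zone temperatures, action) -> scalar reward.
RewardFn = Callable[[Sequence[float], Sequence[float]], float]


def comfort_band_reward(
    zone_temps: Sequence[float],
    action: Sequence[float],
    low: float = 20.0,
    high: float = 24.0,
    energy_weight: float = 0.1,
) -> float:
    """Penalize only the degrees outside the [low, high] comfort band."""
    violation = sum(max(low - t, 0.0) + max(t - high, 0.0) for t in zone_temps)
    energy = sum(abs(a) for a in action)
    return -(violation + energy_weight * energy)


class ToyEnv:
    """Stand-in environment (NOT BEAR): each action shifts a zone temperature."""

    def __init__(self, initial_temps: Sequence[float], reward_fn: RewardFn):
        self.temps: List[float] = list(initial_temps)
        self.reward_fn = reward_fn

    def step(self, action: Sequence[float]) -> Tuple[List[float], float]:
        # Toy dynamics: the temperature change equals the action, zone by zone.
        self.temps = [t + a for t, a in zip(self.temps, action)]
        return self.temps, self.reward_fn(self.temps, action)


env = ToyEnv([25.0, 21.0], comfort_band_reward)
temps, reward = env.step([-1.0, 0.0])
```

The point of the design is that the environment only depends on the callable's signature, so a norm-based, band-based, or price-weighted reward can be swapped in without touching the dynamics.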