eureka-research / Eureka

Official Repository for "Eureka: Human-Level Reward Design via Coding Large Language Models" (ICLR 2024)
https://eureka-research.github.io/
MIT License
2.73k stars 244 forks

Application in different environment and reward type #47

Open edatsika opened 6 days ago

edatsika commented 6 days ago

Can this reward design algorithm be applied to a different application setup? For instance, could it be used with a custom environment that simulates a network of nodes and users, where an RL algorithm performs some kind of network optimization? In that case, the environment code would be fed to the LLM agent and the task would be described in natural language. Any intuition on how this could be implemented?
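To make the setup in the question concrete, here is a minimal Python sketch of the two pieces it mentions: a prompt built from the environment source plus a natural-language task description, and a candidate reward function plugged back into the environment for scoring. Everything in it is an assumption made for illustration and none of it is this repository's API: `ToyNetworkEnv`, `build_prompt`, `load_reward_fn`, `rollout`, and the `compute_reward` name are hypothetical, and the candidate reward string is hand-written to stand in for an LLM reply (no LLM is called).

```python
class ToyNetworkEnv:
    """Hypothetical environment: each new user is assigned to one network node."""

    def __init__(self, num_nodes=3, num_users=6, capacity=2):
        self.num_nodes = num_nodes
        self.num_users = num_users
        self.capacity = capacity
        self.reset()

    def reset(self):
        # load[i] = number of users currently served by node i
        self.load = [0] * self.num_nodes
        self.assigned = 0
        return self.observe()

    def observe(self):
        return {"load": list(self.load), "capacity": self.capacity}

    def step(self, action, reward_fn):
        # action = index of the node that serves the next user
        self.load[action] += 1
        self.assigned += 1
        obs = self.observe()
        done = self.assigned >= self.num_users
        return obs, reward_fn(obs), done


def build_prompt(env_source, task_description):
    """Combine environment source code and a task description into one prompt."""
    return (
        "Write a Python function compute_reward(obs) for this environment.\n\n"
        "Environment code:\n" + env_source + "\n\n"
        "Task: " + task_description + "\n"
    )


def load_reward_fn(code):
    """Turn reward code (a string) into a callable.

    exec() runs arbitrary code, so untrusted LLM output should be sandboxed.
    """
    namespace = {}
    exec(code, namespace)
    return namespace["compute_reward"]


def rollout(env, policy, reward_fn):
    """Run one episode and return the summed reward of the candidate."""
    obs, total, done, t = env.reset(), 0.0, False, 0
    while not done:
        obs, reward, done = env.step(policy(obs, t), reward_fn)
        total += reward
        t += 1
    return total


# Hand-written stand-in for an LLM reply (assumption, not real model output).
CANDIDATE = '''
def compute_reward(obs):
    overload = sum(max(0, l - obs["capacity"]) for l in obs["load"])
    imbalance = max(obs["load"]) - min(obs["load"])
    return -1.0 * overload - 0.1 * imbalance
'''

if __name__ == "__main__":
    prompt = build_prompt("<environment source here>", "Balance users across nodes.")
    reward_fn = load_reward_fn(CANDIDATE)
    round_robin = lambda obs, t: t % len(obs["load"])
    print(prompt)
    print(rollout(ToyNetworkEnv(), round_robin, reward_fn))
```

In a full loop, several candidates would be sampled per iteration, each one used to train a policy, and the resulting task metrics fed back into the next prompt. The sketch scores a fixed round-robin policy instead of training anything.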