eureka-research / Eureka

Official Repository for "Eureka: Human-Level Reward Design via Coding Large Language Models" (ICLR 2024)
https://eureka-research.github.io/
MIT License

Questions about the performance bar chart based on IsaacGym #39

Open ChangLee0903 opened 3 months ago

ChangLee0903 commented 3 months ago

Dear Eureka project members,

I really appreciate Eureka, which shows that an LLM can design better reward functions. I have two questions about the results in your bar chart:

  1. How do you handle the case where the sparse reward outperforms the human-designed reward? I ran the experiment on the Quadcopter task. Ignoring the survival condition, the sparse reward performs much better than the human design, which seems to differ from the result in your bar chart.

    image
  2. Is the number for FrankaCabinet a typo? The bar labeled 12 times appears to be the same height as the bar labeled 2 times.

    image
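
To make the first question concrete, here is a minimal sketch. It assumes the bars show a human-normalized score of the form (method - sparse) / |human - sparse|; that formula, the function name, and the numbers are my assumptions for illustration, not values from my runs or from the repository.

```python
def human_normalized_score(method: float, sparse: float, human: float) -> float:
    """Assumed normalization: (method - sparse) / |human - sparse|."""
    gap = abs(human - sparse)
    if gap == 0:
        # The normalization is undefined when both baselines score the same.
        raise ValueError("human and sparse scores are equal")
    return (method - sparse) / gap


# Made-up scores for a task where the sparse reward beats the human design.
sparse, human = 1.0, 0.5

human_score = human_normalized_score(human, sparse, human)    # -1.0
sparse_score = human_normalized_score(sparse, sparse, human)  # 0.0
```

Under this assumed normalization the human baseline would land at -1 rather than 1 whenever the sparse reward wins, which is why I am asking how such a case is reported.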

Also, some tasks, such as FrankaCabinet and Anymal, show a spike in the success metric when their survival conditions are unstable, which might make the comparison unfair. Perhaps it would be better to count survival as part of performance?
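
As a rough illustration of what I mean by counting survival, the sketch below compares a raw success average with one that only credits episodes that survived. The `Episode` fields and both helper functions are hypothetical names for this example and do not come from the Eureka or IsaacGym code; the episode values are made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Episode:
    success: float  # task success metric for the episode (hypothetical field)
    survived: bool  # True if the episode was not terminated early (hypothetical field)


def raw_success(episodes: List[Episode]) -> float:
    """Mean success over all episodes, ignoring survival."""
    return sum(e.success for e in episodes) / len(episodes)


def survival_adjusted_success(episodes: List[Episode]) -> float:
    """Mean success where episodes that did not survive contribute zero."""
    return sum(e.success for e in episodes if e.survived) / len(episodes)


# Made-up episodes: two of the three successes come from episodes that did not survive.
episodes = [
    Episode(1.0, True),
    Episode(1.0, False),
    Episode(0.0, True),
    Episode(1.0, False),
]

raw = raw_success(episodes)                     # 0.75
adjusted = survival_adjusted_success(episodes)  # 0.25
```

With these made-up episodes, the raw metric rewards the unstable runs, while the survival-adjusted one does not. Other ways of combining the two signals are possible; this is only one option.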

Best, Chi-Chang Lee