eureka-research / Eureka

Official Repository for "Eureka: Human-Level Reward Design via Coding Large Language Models" (ICLR 2024)
https://eureka-research.github.io/
MIT License
2.73k stars 244 forks source link

Adding code run failure handling process & Update to align with the newest OpenAI package formats #34

Open cylqqqcyl opened 7 months ago

cylqqqcyl commented 7 months ago

Description

Adding code run failure handling process:

This is done by query back to the API with the exception message and reward function code. There is a new parameter added in the config.yaml called max_retries that limits the attempts to fix the buggy reward code and prevents infinite loop. The current code can now at least generate 1 legit reward functions and, in most cases, generate all the num_samples to be executable.

Update to align with the newest OpenAI package formats:

According to https://github.com/openai/openai-python/discussions/742 OpenAI has released a new major version of its SDK, and they recommend upgrading promptly. I reimplemented the queries and handling of responses to align with the new requirements.

Type of change

Related Issues

https://github.com/eureka-research/Eureka/issues/4

https://github.com/eureka-research/Eureka/issues/4#issuecomment-1774193819

Specific changes

Screenshots

Example of failed run and automatic code fix:

samples