noahshinn / reflexion

[NeurIPS 2023] Reflexion: Language Agents with Verbal Reinforcement Learning
MIT License

Using CodeLLaMA causes the program to crash #39

Open allanj opened 3 months ago

allanj commented 3 months ago

I was trying the script with CodeLLaMA.

Sometimes the script just gets killed without showing any error. Any intuition as to why? [image]

allanj commented 3 months ago

I managed to work around this with an approach similar to PAL (https://github.com/reasoning-machines/pal) that does not use a thread.

I'm not sure whether this is an appropriate approach, but I can open a PR if it is acceptable.

import copy
import datetime
from typing import Any, Dict

class GenericRuntime:
    GLOBAL_DICT = {}
    LOCAL_DICT = None
    HEADERS = []
    def __init__(self):
        self._global_vars = copy.copy(self.GLOBAL_DICT)
        self._local_vars = copy.copy(self.LOCAL_DICT) if self.LOCAL_DICT else None

        for c in self.HEADERS:
            self.exec_code(c)

    def exec_code(self, code_piece: str) -> None:
        exec(code_piece, self._global_vars)

    def clear(self):
        self._global_vars = {}
        self._local_vars = None

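import signal
import traceback

# NOTE: `timeout` is used below but was not defined in the snippet as posted.
# This class is an assumed, minimal signal-based helper (Unix only, main thread only).
class timeout:
    def __init__(self, seconds: float = 1, error_message: str = "Timeout"):
        self.seconds = seconds
        self.error_message = error_message

    def _handler(self, signum, frame):
        raise TimeoutError(self.error_message)

    def __enter__(self):
        signal.signal(signal.SIGALRM, self._handler)
        # setitimer accepts fractional seconds, unlike signal.alarm
        signal.setitimer(signal.ITIMER_REAL, self.seconds)

    def __exit__(self, exc_type, exc_value, tb):
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel the pending timer

def run_code(runtime, code_gen: str, time_out: float = 10):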
    snippet = code_gen.split('\n')
    try:
        with timeout(time_out):
            runtime.exec_code(code_gen)
            print("success")
            return snippet, None  # No error, so return None for the error message
    except Exception as e:
        error_message = str(e)  # Capture the error message
        # Optionally, you can use traceback to get a more detailed error message
        # error_message = traceback.format_exc()
        print("error:", error_message)
        return snippet, error_message  # Return both the code snippet and the error 

func = "def add(a, b):\n    while True:\n        x = 1\n    return a + b\nassert add(1, 2) == 3"
run_code(runtime=GenericRuntime(), code_gen=func, time_out=1)

This snippet should return once the timeout expires, even though the "while True" loop would otherwise run forever.

Using a thread does not seem to really stop the main program.
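For comparison, here is a rough sketch of a process-based variant. Python has no supported way to forcibly stop a thread, but a child process can be killed once it overruns its time limit. The name `run_code_in_subprocess` and its return convention are hypothetical (they are not part of reflexion or PAL), and the sketch assumes the generated code is self-contained and that only the final line of the traceback is needed as the error message.

```python
import subprocess
import sys
from typing import Optional


def run_code_in_subprocess(code: str, time_out: float = 10.0) -> Optional[str]:
    """Run `code` in a separate Python interpreter.

    Returns None on success, otherwise a short error message.
    Hypothetical helper for illustration only.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=time_out,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return f"TimeoutError: no result after {time_out} seconds"

    if proc.returncode != 0:
        # The last line of the traceback normally names the exception.
        lines = proc.stderr.strip().splitlines()
        return lines[-1] if lines else f"exit code {proc.returncode}"
    return None


if __name__ == "__main__":
    looping = (
        "def add(a, b):\n"
        "    while True:\n"
        "        x = 1\n"
        "    return a + b\n"
        "assert add(1, 2) == 3"
    )
    print(run_code_in_subprocess(looping, time_out=1))
    print(run_code_in_subprocess("assert 1 + 1 == 2"))
```

The trade-off is that each call pays for starting a new interpreter and state is not shared with the parent, whereas the signal-based version runs in-process but only works on Unix and in the main thread.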