haesleinhuepf / human-eval-bia

Benchmarking Large Language Models for Bio-Image Analysis Code Generation
MIT License

Samples lost due to error when sampling #61

Open haesleinhuepf opened 5 months ago

haesleinhuepf commented 5 months ago

When an error happens while sampling LLMs, all samples retrieved before the error are lost. This can cause substantial costs, e.g. when working with commercial LLMs. We could build in some error handling, e.g. in this notebook, for example a repeat-n-times scheme like this (untested code):

def generate_one_completion_blablador_mistral(input_code):
    import os
    import time
    import openai

    n_times_until_no_error = 3
    for _ in range(n_times_until_no_error):
        try:
            client = openai.OpenAI(
                base_url='https://helmholtz-blablador.fz-juelich.de:8000/v1',
                api_key=os.environ.get('BLABLADOR_API_KEY'),
            )
            response = client.chat.completions.create(
                model=model_blablador_mistral,
                messages=[{"role": "user", "content": setup_prompt(input_code)}],
            )
            return response.choices[0].message.content.strip()
        except Exception:
            time.sleep(10)  # wait 10 seconds before trying again
    raise RuntimeError("Sampling failed after " + str(n_times_until_no_error) + " attempts")

Alternatively, we need to find a way to store intermediate jsonl files. This would require a modification within the HumanEval framework, which I don't know very well.
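The intermediate-storage idea could also be approximated outside the HumanEval framework by appending each finished sample to a jsonl file as soon as it is generated, and skipping already-stored tasks on restart. A minimal sketch (the function names and the `task_id`/`completion` record layout are assumptions, not part of the existing code):

```python
import json
import os

def append_sample(jsonl_path, task_id, completion):
    # Append one finished sample immediately, so samples retrieved
    # before a later crash are not lost (hypothetical helper).
    with open(jsonl_path, "a") as f:
        f.write(json.dumps({"task_id": task_id, "completion": completion}) + "\n")

def completed_task_ids(jsonl_path):
    # On restart, collect task_ids that already have a stored sample,
    # so only the missing ones need to be re-sampled.
    if not os.path.exists(jsonl_path):
        return set()
    with open(jsonl_path) as f:
        return {json.loads(line)["task_id"] for line in f if line.strip()}
```

Because jsonl is line-oriented, appending is cheap and a partially written run remains readable up to the last complete line.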

haesleinhuepf commented 5 months ago

... and it might be possible to implement this once for all functions, using the scheme used here