defog-ai / sql-eval

Evaluate the accuracy of LLM-generated outputs

Continuous eval script #113

Closed: wongjingping closed this 3 months ago

wongjingping commented 3 months ago

Adds a script that continuously pulls down newly saved models and evaluates them with the vllm runner. You can run it in the background (via nohup/screen/tmux) with `python3 gcs_eval.py`, and it will run this loop continuously (checking every 10s): list the bucket for newly saved models, download any it hasn't seen yet, and evaluate each one with the vllm runner.
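The loop body itself isn't shown in this excerpt, but from the description above, a minimal sketch of the polling logic might look like the following. This is an assumption-laden illustration, not the actual `gcs_eval.py`: the bucket name, directory layout, and eval command are all hypothetical, and the real script's flags and structure may differ.

```python
# Minimal sketch of a GCS polling loop (hypothetical names throughout).
# Assumes the google-cloud-storage package is installed and credentials
# are configured in the environment.
import subprocess
import time

from google.cloud import storage

BUCKET_NAME = "my-model-bucket"  # assumption: real bucket name not given in the issue
POLL_INTERVAL_S = 10             # the issue says the loop checks every 10s

seen: set[str] = set()  # model prefixes we have already evaluated


def list_model_prefixes(client: storage.Client) -> set[str]:
    """Return top-level 'directories' in the bucket, treating each as one model."""
    blobs = client.list_blobs(BUCKET_NAME)
    return {blob.name.split("/")[0] for blob in blobs if "/" in blob.name}


def main() -> None:
    client = storage.Client()
    while True:
        for model in sorted(list_model_prefixes(client) - seen):
            # Download the new checkpoint locally.
            subprocess.run(
                ["gsutil", "-m", "cp", "-r", f"gs://{BUCKET_NAME}/{model}", "models/"],
                check=True,
            )
            # Hypothetical eval invocation: the actual sql-eval CLI flags
            # for the vllm runner may differ from this.
            subprocess.run(
                ["python3", "main.py", "-g", "vllm", "-m", f"models/{model}"],
                check=True,
            )
            seen.add(model)
        time.sleep(POLL_INTERVAL_S)


if __name__ == "__main__":
    main()
```

Tracking already-evaluated models in an in-memory set keeps the sketch simple, but it means the script re-evaluates everything after a restart; a real version would likely persist that state to disk or skip models whose eval results already exist.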

Some things that don't work: