defog-ai / sql-eval

Evaluate the accuracy of LLM-generated outputs

Continuous eval script #113

Closed: wongjingping closed this 3 months ago

wongjingping commented 3 months ago

Adds a script that continuously pulls down newly saved models and evaluates them with the vllm runner. You can run it in the background (via nohup/screen/tmux) with `python3 gcs_eval.py`, and it will run this loop continuously (checking every 10s): list the bucket for newly saved models, download any it hasn't seen yet, and evaluate each one with the vllm runner.
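The loop body itself isn't shown in this excerpt, but from the description above, a minimal sketch of the polling logic might look like the following. This is an assumption-laden illustration, not the actual `gcs_eval.py`: the bucket name, directory layout, and eval command are all hypothetical, and the real script's flags and structure may differ.

```python
# Minimal sketch of a GCS polling loop (hypothetical names throughout).
# Assumes the google-cloud-storage package is installed and credentials
# are configured in the environment.
import subprocess
import time

from google.cloud import storage

BUCKET_NAME = "my-model-bucket"  # assumption: real bucket name not given in the issue
POLL_INTERVAL_S = 10             # the issue says the loop checks every 10s

seen: set[str] = set()  # model prefixes we have already evaluated


def list_model_prefixes(client: storage.Client) -> set[str]:
    """Return top-level 'directories' in the bucket, treating each as one model."""
    blobs = client.list_blobs(BUCKET_NAME)
    return {blob.name.split("/")[0] for blob in blobs if "/" in blob.name}


def main() -> None:
    client = storage.Client()
    while True:
        for model in sorted(list_model_prefixes(client) - seen):
            # Download the new checkpoint locally.
            subprocess.run(
                ["gsutil", "-m", "cp", "-r", f"gs://{BUCKET_NAME}/{model}", "models/"],
                check=True,
            )
            # Hypothetical eval invocation: the actual sql-eval CLI flags
            # for the vllm runner may differ from this.
            subprocess.run(
                ["python3", "main.py", "-g", "vllm", "-m", f"models/{model}"],
                check=True,
            )
            seen.add(model)
        time.sleep(POLL_INTERVAL_S)


if __name__ == "__main__":
    main()
```

Tracking already-evaluated models in an in-memory set keeps the sketch simple, but it means the script re-evaluates everything after a restart; a real version would likely persist that state to disk or skip models whose eval results already exist.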

Some things that don't work: