defog-ai / sql-eval

Evaluate the accuracy of LLM generated outputs
Apache License 2.0

gcs eval for checkpoint weights #132

Closed (wongjingping closed this 2 months ago)

wongjingping commented 2 months ago

Added a script similar to gcs_eval.py that works with nested folders of weights, for example:

models/
├── model1/
│   ├── checkpoint-500/ # contains the safetensors
│   └── checkpoint-1000/
└── model2/
    ├── checkpoint-500/
    └── checkpoint-1000/

We keep the original gcs_eval.py around for users who want to work with non-nested folders of weights, for example:

models/
├── model1/ # contains the safetensors
└── model2/
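
To make the difference between the two layouts concrete, here is a minimal sketch of how a weights directory could be located in either one. It is not taken from this PR: the function name `find_weight_dirs` is hypothetical, it assumes the weight files end in `.safetensors`, and it walks a local copy of the folder rather than a GCS bucket.

```python
from pathlib import Path


def find_weight_dirs(root):
    """Return the directories under `root` that directly hold .safetensors files.

    Hypothetical helper for illustration only. Paths are returned relative
    to `root`, so a nested layout yields entries such as
    "model1/checkpoint-500" and a flat layout yields entries such as "model1".
    """
    root = Path(root)
    # Every weight file's parent directory is one evaluable model or checkpoint.
    dirs = {p.parent.relative_to(root).as_posix() for p in root.rglob("*.safetensors")}
    return sorted(dirs)
```

Under these assumptions the same helper covers both cases, since it keys on where the weight files actually sit instead of on folder depth.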

For context, you can run this script with python3 gcs_eval.py and it will continuously pull models from GCS, evaluate each one (writing the results/csv locally), and then move the evaluated model into a different directory on GCS.
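
The pull, evaluate, move cycle described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in gcs_eval.py: the names `process_pending` and `run_forever` are hypothetical, and the four callables stand in for whatever GCS and eval commands the real script uses.

```python
import time


def process_pending(list_pending, pull, evaluate, archive):
    """Run one pass over every model that is waiting to be evaluated.

    All four arguments are caller-supplied callables (assumptions for this sketch):
      list_pending() -> iterable of model names still to evaluate
      pull(name)     -> local path of the downloaded weights
      evaluate(path) -> runs the eval and writes results locally
      archive(name)  -> moves the evaluated model to a different remote directory
    """
    processed = []
    for name in list_pending():
        local_path = pull(name)
        evaluate(local_path)
        # Archive only after the eval has finished, so a crash mid-eval
        # leaves the model in the pending location for the next pass.
        archive(name)
        processed.append(name)
    return processed


def run_forever(list_pending, pull, evaluate, archive, poll_seconds=60):
    """Repeat passes indefinitely, sleeping whenever nothing is pending."""
    while True:
        if not process_pending(list_pending, pull, evaluate, archive):
            time.sleep(poll_seconds)
```

Passing the storage and eval steps in as callables keeps the loop independent of any particular GCS client or command line, which this sketch deliberately does not assume.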

rishsriv commented 2 months ago

Nice, super convenient - thank you!