gcs eval for checkpoint weights

Added a script similar to gcs_eval.py that works with nested folders of weights, where each model folder contains one subfolder per checkpoint, for example:

models/
├── model1/
│   ├── checkpoint-500/ # contains the safetensors
│   └── checkpoint-1000/
└── model2/
    ├── checkpoint-500/
    └── checkpoint-1000/
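
To make the nested layout concrete, here is a minimal sketch of how the checkpoint folders in such a tree could be enumerated once they are on local disk. The function name find_checkpoints and the assumption that checkpoint folders are named checkpoint-<step> are illustrative only; they are taken from the example tree above, not from the script in this PR.

```python
import re
from pathlib import Path

# Assumption: checkpoint folders are named "checkpoint-<step>", as in the tree above.
CHECKPOINT_RE = re.compile(r"checkpoint-(\d+)")


def find_checkpoints(root):
    """Return (model_name, checkpoint_path) pairs for a nested weights folder.

    Hypothetical helper: models are visited in name order, and each model's
    checkpoints are ordered by their numeric step (500 before 1000).
    """
    found = []
    model_dirs = sorted(p for p in Path(root).iterdir() if p.is_dir())
    for model_dir in model_dirs:
        checkpoints = []
        for child in model_dir.iterdir():
            match = CHECKPOINT_RE.fullmatch(child.name)
            # Skip anything that is not a checkpoint-<step> directory.
            if child.is_dir() and match:
                checkpoints.append((int(match.group(1)), child))
        checkpoints.sort(key=lambda item: item[0])
        found.extend((model_dir.name, path) for _, path in checkpoints)
    return found
```

Sorting by the numeric step rather than by name avoids checkpoint-1000 being listed before checkpoint-500.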

The original gcs_eval.py stays in place for users whose weights are in non-nested folders, for example:

models/
├── model1/ # contains the safetensors
└── model2/

For context, running python3 gcs_eval.py continuously pulls models from GCS, evaluates each one (writing the results/CSV locally), and then moves the evaluated model into a different directory on GCS.
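
The pull, evaluate, and move cycle described above can be sketched roughly as follows. This is not the code in gcs_eval.py: every name here (process_pending, run_forever, and the four callables) is hypothetical, and the GCS and evaluation steps are passed in as plain functions so the sketch makes no claims about how the real script talks to GCS.

```python
import time


def process_pending(list_models, pull, evaluate, archive):
    """Run one pass over the models currently waiting to be evaluated.

    All four arguments are hypothetical callables supplied by the caller:
      list_models() -> names of models waiting on GCS
      pull(name)    -> local path of the downloaded weights
      evaluate(name, local_path) -> runs the eval and writes results locally
      archive(name) -> moves the model to the "evaluated" location on GCS
    """
    done = []
    for name in list_models():
        local_path = pull(name)
        evaluate(name, local_path)
        # Archive only after evaluation finishes, so a failed run is retried
        # on the next pass instead of being silently moved away.
        archive(name)
        done.append(name)
    return done


def run_forever(list_models, pull, evaluate, archive, poll_seconds=60):
    """Poll indefinitely, sleeping whenever a pass finds nothing to do."""
    while True:
        if not process_pending(list_models, pull, evaluate, archive):
            time.sleep(poll_seconds)
```

Moving the evaluated model out of the source directory is what keeps the loop from evaluating the same weights twice.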
