Added notebook for uploading vllm eval results to wandb post-hoc.
This notebook will take the results produced by run_checkpoints.sh and run_checkpoints_cot.sh and upload them to the specified run id on wandb.
One quirk of wandb is that it doesn't allow you to log to past steps once they are completed (see forum post). To work around that, we continue logging from the latest step onwards. This makes the steps displayed on wandb inconsistent with the actual checkpoint steps, but it should be sufficient for comparison purposes.
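A minimal sketch of the step-offset idea, not the notebook's actual code. The function name `build_log_calls`, the metric keys, and the choice to add the checkpoint step to the run's latest step are all assumptions made for illustration. The sketch also records the real checkpoint step as a logged value so it can be recovered later, which is an addition and not something the notebook is known to do.

```python
def build_log_calls(results, latest_logged_step):
    """Plan wandb log calls that all land after the run's latest step.

    results: {checkpoint_step: {metric_name: value}} (hypothetical shape).
    latest_logged_step: the last step the training run already logged.
    Returns a list of (wandb_step, payload) tuples in increasing step order.
    """
    calls = []
    for ckpt_step in sorted(results):
        if ckpt_step <= 0:
            raise ValueError("checkpoint steps must be positive")
        # Shift every checkpoint past the latest step so no call
        # targets a step wandb considers already completed.
        wandb_step = latest_logged_step + ckpt_step
        payload = dict(results[ckpt_step])
        # Assumption: keep the true checkpoint step as a metric value.
        payload["eval/checkpoint_step"] = ckpt_step
        calls.append((wandb_step, payload))
    return calls


# Hypothetical eval results for two checkpoints of a run that ended at step 1000.
example = {200: {"eval/acc": 0.61}, 100: {"eval/acc": 0.55}}
planned = build_log_calls(example, latest_logged_step=1000)

# Uploading would then look roughly like this (requires wandb and a real run id):
# import wandb
# run = wandb.init(project="my-project", id="my-run-id", resume="must")
# for step, payload in planned:
#     run.log(payload, step=step)
# run.finish()
```

Sorting the checkpoints first keeps the logged steps increasing, which is what the workaround relies on.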
For example, the screenshot below shows the eval for each step starting after 1k steps: