Is the code in eval_gpt4_longqa.sh correct?

princeton-nlp / HELMET

The HELMET Benchmark

https://arxiv.org/abs/2410.02694

MIT License

Closed by enze5088 1 month ago

enze5088 commented 1 month ago

The contents of eval_gpt4_longqa.sh appear to be Python code rather than a shell script.

howard-yen commented 1 month ago

Thanks for catching this, it's been fixed!