OSU-NLP-Group / TravelPlanner

[ICML'24 Spotlight] "TravelPlanner: A Benchmark for Real-World Planning with Language Agents"
https://osu-nlp-group.github.io/TravelPlanner/
MIT License

Request for evaluation code examples #8

Closed. bamos closed this issue 6 months ago

bamos commented 6 months ago

I was able to run the provided command from the README to generate the directory of outputs:

python sole_planning.py --set_type $SET_TYPE --output_dir $OUTPUT_DIR --model_name $MODEL_NAME --strategy $STRATEGY

But, I don't immediately see how to:

  1. run the evaluation code on these generations (it seems like eval_score is never called or referenced in the code in this repo), and
  2. convert these into the correct single-file format with the queries necessary for the website submission

When you get a chance, can you please also provide the code/examples for doing these? I think it'll make the barrier to getting started here even lower.
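To make the second item concrete, here is a minimal sketch of the kind of merge step being asked about. It is not the repository's code: the file naming (`generated_plan_<idx>.json`), the `queries` mapping, and the output keys (`idx`, `query`, `plan`) are all assumptions made for illustration, and the real submission format should be taken from the released post-processing scripts.

```python
import json
import re
from pathlib import Path


def merge_generations(output_dir, out_path, queries):
    """Merge per-query JSON files into a single JSON Lines file.

    Assumptions (hypothetical, not taken from the repository):
      - each generation lives in a file named generated_plan_<idx>.json
      - `queries` maps idx (int) to the query text
      - the target format is one JSON object per line with the keys
        "idx", "query" and "plan"
    Returns the number of records written.
    """
    pattern = re.compile(r"generated_plan_(\d+)\.json$")
    records = []
    for path in Path(output_dir).iterdir():
        match = pattern.search(path.name)
        if match is None:
            continue  # skip files that are not generations
        idx = int(match.group(1))
        with path.open(encoding="utf-8") as f:
            plan = json.load(f)
        records.append({"idx": idx, "query": queries[idx], "plan": plan})

    # Sort numerically so that index 10 follows 9 rather than 1.
    records.sort(key=lambda r: r["idx"])
    with open(out_path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return len(records)
```

Calling `merge_generations("outputs", "submission.jsonl", queries)` would write one line per generation found in `outputs`, ordered by index. A missing index in `queries` raises `KeyError`, which is deliberate here so that a plan is never submitted without its query.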

hsaest commented 6 months ago

Hi Brandon,

Thank you for your interest in our work.

  1. We will release the evaluation scripts by tomorrow at the latest. As for an evaluation example, we have provided one on Hugging Face and will also add it to this repo. Thank you for the reminder.
  2. As mentioned here, the parsing and extraction code will be released tomorrow. Please stay tuned.

Thank you again. Please feel free to contact us if you have further questions.

Best, Jian

hsaest commented 6 months ago

Hi @bamos,

We have released the post-processing code for parsing and extraction and provided evaluation examples. Please refer to our latest code.

Best, Jian

bamos commented 6 months ago

That's great, thank you!