InternLM / Agent-FLAN

[ACL2024 Findings] Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models
https://internlm.github.io/Agent-FLAN/
Apache License 2.0

Question about evaluation datasets #14

Open JasonZhu1313 opened 3 months ago

JasonZhu1313 commented 3 months ago

Hey,

Great observations and work on disentangling format following from reasoning! Could you share details on the evaluation datasets you used and how to reproduce the results in the paper? I fine-tuned Llama 3 on the dataset and got worse performance on 30 questions curated from the HotpotQA dataset. If you could shed some light on this, it would be much appreciated! Thanks, Jason