FreedomIntelligence / AceGPT

Apache License 2.0

Unable to reproduce results from the paper #5

Open gagan3012 opened 10 months ago

gagan3012 commented 10 months ago

Hello, I am unable to reproduce the results from the paper. Can one of the authors please share the evaluation scripts?

hhwer commented 10 months ago

Hello,

Thank you for reaching out regarding the paper's results. We apologize for the inconvenience you've encountered. We are currently working on releasing the relevant materials and evaluation scripts, which should help you reproduce the results successfully.

In the next few days, we plan to make these resources available to the public. In the meantime, it would be helpful to know which specific part of the paper's results you are trying to reproduce. This information will allow us to prioritize the release of the relevant materials that would assist you the most.

Thank you for your patience, and we look forward to helping you with your reproduction efforts.

Best regards,

gagan3012 commented 10 months ago

Hello, I am looking for evaluations on the exams dataset.

gagan3012 commented 10 months ago

Hello, I am following up on my request.

jianqing666 commented 10 months ago

Hello,

Thank you for following up. We have noted your request for evaluations on the exams dataset. Currently, there are some issues with our base model on Hugging Face, which might have affected the assessment results related to this dataset.

Our team is actively working on resolving these issues and plans to re-upload the corrected model as soon as possible. In addition, we will update the related evaluation dataset code to ensure the accuracy and reliability of the assessment results.

Please watch our Hugging Face page and GitHub repository for the latest model and code updates. We highly value your feedback and are committed to providing better service.

Thank you for your patience and understanding.

Sincerely,

gagan3012 commented 10 months ago

Hello, I am using the script provided in the eval folder, but I still can't reproduce the results. The base model results match the paper, but the chat model results do not.

jianqing666 commented 10 months ago

Hello, we have updated our chat model on Hugging Face and the EXAMS zero-shot evaluation script in our GitHub repository. We recommend downloading the latest versions of both the chat model from Hugging Face and the EXAMS evaluation script from GitHub. These updates should help you reproduce the results accurately.

Sincerely,