Open gagan3012 opened 1 year ago
Hello,
Thank you for reaching out regarding the paper's results. We apologize for the inconvenience you've encountered. We are currently working on releasing the relevant materials and evaluation scripts, which should help you reproduce the results successfully.
In the next few days, we plan to make these resources available to the public. In the meantime, it would be helpful to know which specific part of the paper's results you are trying to reproduce. This information will allow us to prioritize the release of the relevant materials that would assist you the most.
Thank you for your patience, and we look forward to helping you with your reproduction efforts.
Best regards,
Hello, I am looking for evaluations on the EXAMS dataset.
Hello, following up on my request.
Hello,
Thank you for following up. We have noted your request for evaluations on the EXAMS dataset. There are currently some issues with our base model on Hugging Face, which may have affected the evaluation results for this dataset.
Our team is actively working on resolving these issues and plans to re-upload the corrected model as soon as possible. In addition, we will update the related evaluation dataset code to ensure the accuracy and reliability of the assessment results.
Please watch our Hugging Face and GitHub repositories for the latest model and code updates. We highly value your feedback and are committed to providing better service.
Thank you for your patience and understanding.
Sincerely,
Hello, I am using the script provided in the eval folder, but I still can't reproduce the results. The base model results match, but the chat model results do not.
Hello, we have updated our chat model on Hugging Face and the EXAMS zero-shot evaluation script in our GitHub repository. We recommend downloading the latest versions of both the chat model from Hugging Face and the EXAMS evaluation script from GitHub. These updates should help you reproduce the results accurately.
Sincerely,
Hello, I am unable to reproduce the results from the paper. Can one of the authors please share the evaluation scripts?