andywang-25 / Llama2-HPO-Normalization

Fine-tuning LLaMA 2 for rare disease concept normalization

Evaluation and training code availability #2

Open mwiewior opened 2 months ago

mwiewior commented 2 months ago

Hey @andywang-25 - really great work! We analyzed the results and they look really impressive. We would like to fine-tune other open-source models and benchmark some alternative approaches using your test cases. Would it be possible for you to also publish the code you used for fine-tuning Llama-2, as well as the code you used to generate the results in Table 1 of your manuscript? (I only found the code for the Lucene part.) I would really appreciate it, so that we could more precisely verify the performance and see how it compares to the fine-tuned Llama-2 you released.

Many thanks, Marek

andywang-25 commented 2 months ago

Hi Marek,

I apologize for my late response.

On GitHub, I have published the code I used for fine-tuning Llama-2 in the file "Training_code.ipynb". This file includes the steps to generate the training and testing datasets.

Additionally, you can follow the code in "HPO_Model_Github.ipynb" to produce results with the testing data. This can be done by uploading the testing dataset of choice (synonyms, single typos, complex typos).

Please let me know if you have any questions and I would be happy to help, Andy

mwiewior commented 2 months ago

Hi Andy, thank you very, very much! I will take a look, but it seems that's what I was looking for. Thanks again, Marek

mwiewior commented 2 months ago

@andywang-25 - one minor thing - I checked all the files and it seems that the file test_synonyms.json is empty. Could you please take a look at it?

Thanks, Marek
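
For anyone reproducing this check, a minimal sketch of how an empty data file could be detected is shown below. The helper name `is_empty_json` is hypothetical, and the sketch assumes the file is either a single JSON document or line-delimited JSON; it is not taken from the repository's notebooks.

```python
import json
from pathlib import Path


def is_empty_json(path):
    """Return True if the file is missing, blank, or holds an empty list/dict.

    Hypothetical helper for illustration; not part of the repository.
    """
    p = Path(path)
    # A missing or zero-byte file counts as empty.
    if not p.exists() or p.stat().st_size == 0:
        return True
    text = p.read_text(encoding="utf-8").strip()
    # Whitespace-only content also counts as empty.
    if not text:
        return True
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Non-blank content that is not a single JSON document
        # (for example JSON Lines) is treated as non-empty.
        return False
    # An empty top-level list or dict carries no test cases.
    return isinstance(data, (list, dict)) and len(data) == 0
```

Calling `is_empty_json("test_synonyms.json")` from the directory that holds the downloaded file would then report whether it contains any records.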

andywang-25 commented 2 months ago

Hi Marek,

Thank you for pointing that out. I've just uploaded a new copy of the test_synonyms.json file. Additionally, I've made a very slight modification to Training_code.ipynb so that it produces the same data files I uploaded to GitHub.

Best, Andy

mwiewior commented 2 months ago

Hi Andy - thank you! I've checked and everything is perfect now.

Thanks, Marek

andywang-25 commented 2 months ago

Glad to hear it!
