Closed by NoAtmosphere0 1 month ago
Hi @NoAtmosphere0!
I believe the easiest way to achieve this would be by fine-tuning one of the GoLLIE checkpoints with a Vietnamese dataset. Both Wikiann and Polyglot NER seem like the best candidates since they use the same labels as CoNLL03. To fine-tune your model with either of these datasets, you should:
Switch the model from codellama/CodeLlama-7b-hf to HiTZ/GoLLIE-7B.
A significant concern here is the proficiency of LLaMA2/CodeLLaMA in Vietnamese. The model might not be very capable in that language, and unfortunately, there's a limited selection of multilingual LLMs available.
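For the data-preparation side, here is a minimal sketch of turning token-level IOB2 tags (the scheme used by CoNLL03-style NER data) into entity spans. It is not the repository's actual loader: the function name `iob2_to_spans` and the example sentence are made up for illustration, and the resulting (text, label) pairs would still need to be mapped onto whatever format the GoLLIE training pipeline expects.

```python
from typing import List, Optional, Tuple


def iob2_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group IOB2-tagged tokens into (entity_text, label) pairs.

    Hypothetical helper for illustration; not part of GoLLIE.
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label: Optional[str] = None

    for token, tag in zip(tokens, tags):
        starts_entity = tag.startswith("B-") or (
            # Tolerate an I- tag that does not continue the open entity.
            tag.startswith("I-") and tag[2:] != current_label
        )
        if starts_entity:
            if current_tokens:
                spans.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [token], tag[2:]
        elif tag.startswith("I-"):
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # An "O" tag closes any open entity.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [], None

    # Flush an entity that runs to the end of the sentence.
    if current_tokens:
        spans.append((" ".join(current_tokens), current_label))
    return spans


# Illustrative Vietnamese sentence: "Lan lives in Hà Nội".
tokens = ["Lan", "sống", "ở", "Hà", "Nội"]
tags = ["B-PER", "O", "O", "B-LOC", "I-LOC"]
print(iob2_to_spans(tokens, tags))
# [('Lan', 'PER'), ('Hà Nội', 'LOC')]
```

Joining tokens with a single space is a simplification; if exact character offsets matter, keep the token indices instead of the joined text.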
Hi @ikergarcia1996!
Thank you for your prompt response and helpful instructions. We will follow the steps that you have outlined in your response to train GoLLIE and also keep in mind your concerns about the proficiency of LLaMA2/CodeLLaMA in Vietnamese.
We will leave this issue open so we can keep you updated on our progress, and we will let you know if we have any questions or need further assistance. Thanks again for your support!
@NoAtmosphere0 Have you made any progress on this?
Hi GoLLIE research team, I am part of a group of Vietnamese university students who want to present your paper at an upcoming seminar in our "Introduction to Natural Language Processing" course. Our task is to summarize and explain the contents of your paper to our fellow students and lecturers.
To make it easier to understand for our classmates, we are interested in training GoLLIE using Vietnamese datasets. If it's possible, we would greatly appreciate it if you could provide us with some instructions on how to proceed with this. We sincerely enjoyed reading your paper and believe that it would greatly benefit our presentation.
Here are some datasets for the named-entity-recognition subtask that I found on Hugging Face:
We would be extremely grateful if you could provide us with any guidance or assistance with this endeavor. Please feel free to reach out if you have any questions or require more information from us. We are more than willing to cooperate to make this collaboration successful.