hitz-zentroa / GoLLIE

Guideline following Large Language Model for Information Extraction
https://hitz-zentroa.github.io/GoLLIE/
Apache License 2.0

Question about code #16

Closed TuRan-sino closed 2 months ago

TuRan-sino commented 3 months ago

This work is very impressive, but I have a question about your code. Can lines 158–171 of src/dataset/dataset.py be expressed as the formula below?

[screenshot of the formula]

I don't understand the purpose of this formula.

ikergarcia1996 commented 3 months ago

Hi @TuRan-sino

The code lets you set what fraction of the total loss comes from the prompt tokens versus the result tokens (the tokens the model must generate) during training. If you set the parameter prompt_loss_weight: 0.05, the prompt tokens contribute 5% of the total loss, while the result tokens contribute 95%. The formula computes the weight that should be assigned to each prompt token so that, together, the prompt tokens account for 5% of the total loss.
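The idea can be sketched as follows. This is a minimal re-derivation of the logic described above, not the repository's exact code; the function and variable names are mine. With result tokens kept at weight 1.0, each prompt token gets weight w = (p / (1 - p)) * n_result / n_prompt, so the prompt's share of the weighted loss is exactly p:

```python
def prompt_token_weight(n_prompt: int, n_result: int, prompt_loss_weight: float) -> float:
    """Per-prompt-token weight so that prompt tokens contribute
    `prompt_loss_weight` of the total loss (result tokens keep weight 1.0).

    Hypothetical illustration of the computation in src/dataset/dataset.py.
    """
    w = n_result * prompt_loss_weight   # desired prompt share of the loss ...
    w = w / (1 - prompt_loss_weight)    # ... expressed relative to the result tokens
    w = w / n_prompt                    # ... spread evenly over each prompt token
    return w

# Example: with prompt_loss_weight = 0.05 the prompt carries 5% of the loss.
n_prompt, n_result, p = 40, 60, 0.05
w = prompt_token_weight(n_prompt, n_result, p)
prompt_loss = n_prompt * w            # total weighted prompt contribution
total_loss = prompt_loss + n_result   # result tokens each have weight 1.0
print(round(prompt_loss / total_loss, 6))  # → 0.05
```

Since n_prompt * w = (p / (1 - p)) * n_result, the ratio prompt_loss / total_loss simplifies to p regardless of the token counts.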

We conducted some tests with this parameter during the development of GoLLIE and found that setting the prompt token loss weight to 0 yields the best results. That's why every configuration in GoLLIE/configs has prompt_loss_weight set to 0, so the result of this formula is always 0. However, we have retained the parameter in the code for future experiments.

TuRan-sino commented 3 months ago

Thanks for your reply