AI4Bharat / IndicLID

Language Identification for Indian languages

return softmax scores instead of logits #2

Open tfriedel opened 1 year ago

tfriedel commented 1 year ago

I noticed that scores calculated with FTR or FTN tend to lie between 0 and 1, although I have also seen a value of 1.000041.

The scores from IndicLID-BERT, however, are raw logits, so a score could be 6.42, for example. This is unfortunate: as a user you would rather have something resembling a probability, and you want scores from the different models to be comparable. I suggest applying a softmax to the logits in IndicLID-BERT and returning the result.
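A minimal sketch of the suggested conversion, assuming the model's output is available as a plain list of per-language logits. The `softmax` helper and the example values are illustrative only and are not taken from the IndicLID code:

```python
import math

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    # Subtract the largest logit first so math.exp cannot overflow.
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate languages (made-up values).
logits = [6.42, 1.10, -0.35]
probs = softmax(logits)

# The top score is now a value in [0, 1], like the FTR/FTN scores.
best_index = max(range(len(probs)), key=probs.__getitem__)
best_score = probs[best_index]
```

If the logits are held in a PyTorch tensor, `torch.softmax(logits, dim=-1)` should perform the same normalization.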

ammar-wysa commented 1 month ago

Hi @tfriedel, yes, this makes perfect sense! I’m currently working on using this model in an ensemble with others, but I am having difficulty dealing with the raw logits.