MastafaF / multilingual_similarity_compare


What metric is tracked in the table? #1

Open timpal0l opened 4 years ago

timpal0l commented 4 years ago

The lower the better, I assume. Is it the error rate you're tracking?

MastafaF commented 4 years ago

Hi @timpal0l,

The lower the better indeed. We are tracking the error rate for each pair of languages.

For curious readers: each document corresponds to one language, so there are as many documents as languages. We iterate over each document sentence by sentence. The tables compare how similar the documents are according to each multilingual architecture. If a model were perfect, every error rate in the README tables would be 0: for each sentence of a document d_i, it would find the translation in document d_k for all k != i, so we would never increment the error count.
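
To make the idea concrete, here is a minimal sketch of how such an error rate could be computed for one pair of documents. It is an illustration, not necessarily the repository's exact code. It assumes the sentences of both documents are already embedded and aligned (row j of each matrix is the same sentence in the two languages), and that the translation is retrieved as the nearest neighbour under cosine similarity. The function name `pairwise_error_rate` and the toy embeddings are hypothetical.

```python
import numpy as np


def pairwise_error_rate(src_emb, tgt_emb):
    """Fraction of source sentences whose nearest target sentence
    is NOT the aligned translation (assumed: cosine similarity)."""
    # Normalise rows so the dot product equals cosine similarity.
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)

    # sims[j, m] = similarity between source sentence j and target sentence m.
    sims = src @ tgt.T

    # For each source sentence, pick the most similar target sentence.
    predicted = sims.argmax(axis=1)

    # Sentences are aligned, so the correct answer for row j is j.
    expected = np.arange(len(src))
    return float(np.mean(predicted != expected))


# Toy embeddings (hypothetical): three sentences, perfectly aligned.
perfect = np.eye(3)
print(pairwise_error_rate(perfect, perfect))  # 0.0

# Same target embeddings with the first two sentences swapped:
# two of the three retrievals are now wrong.
swapped = np.eye(3)[[1, 0, 2]]
print(pairwise_error_rate(perfect, swapped))
```

Under these assumptions, a table would hold one such value for every ordered pair (d_i, d_k) with k != i, and a perfect model would give 0 everywhere.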

Regards, Mastafa