machinalis / yalign

A sentence aligner for comparable corpora

Set WordPairScore to prefer maximum scoring pairs (rather than random). #2

Closed DrDub closed 10 years ago

DrDub commented 10 years ago

Previously, WordPairScore took the last value seen for a potential translation when several source words clashed on the same target word. This didn't seem correct.

For example (using the dictionary.csv from the tutorial):

He abstained from any further comments. Se abstuvo de hacer mas comentarios.

The words 'abstained' and 'any' can both map to 'se', but the score for 'abstained' is 0.0138 while the score for 'any' is 0.0015. The current code returns the smaller value because 'any' appears later in the sentence.

This commit fixes the issue by keeping the maximum score registered within the sentence.
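
To make the difference concrete, here is a minimal sketch of the two behaviours. It is not the actual yalign code: the function names, the `translations` lookup table and its `(source, target)` key layout are all made up for illustration, and only the two scores quoted above are filled in.

```python
# Hypothetical lookup: (source word, target word) -> translation score.
# Only the two scores mentioned in this thread are included.
translations = {
    ("abstained", "se"): 0.0138,
    ("any", "se"): 0.0015,
}

def scores_last_wins(src_words, tgt_words, table):
    """Old behaviour (sketch): a later source word overwrites an earlier one."""
    scores = {}
    for s in src_words:
        for t in tgt_words:
            p = table.get((s, t))
            if p is not None:
                scores[t] = p  # overwrites whatever was stored before
    return scores

def scores_max(src_words, tgt_words, table):
    """New behaviour (sketch): keep the highest score seen for each target word."""
    scores = {}
    for s in src_words:
        for t in tgt_words:
            p = table.get((s, t))
            if p is not None:
                scores[t] = max(p, scores.get(t, 0.0))
    return scores

src = "he abstained from any further comments".split()
tgt = "se abstuvo de hacer mas comentarios".split()

old = scores_last_wins(src, tgt, translations)
new = scores_max(src, tgt, translations)
print(old["se"], new["se"])
```

With the old behaviour 'se' ends up with the score from 'any', simply because 'any' comes after 'abstained'; with the maximum rule the word order no longer matters.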

rafacarrascosa commented 10 years ago

Thank you very much Pablo and sorry for the delay!

DrDub commented 10 years ago

So... did you merge it?

rafacarrascosa commented 10 years ago

Wtf, I could swear I had merged it :S Maybe it was the "confirm button" or the "confirm that you confirm button" or the "confirm that you confirm that you confirm button". It's not like we are ejecting or something. Sorry, Pablo.