Open fusaroli opened 6 years ago
The fix seems pretty straightforward: in calculate_alignment.py, in the function LexicalPOSAlignment, the line
for ngram in range(2,maxngram+1):
should become
for ngram in range(1,maxngram+1):
This, however, has the potential drawback of giving the user an additional, meaningless syntactic alignment of 1-grams. No biggie for me, but if we want to avoid that, it could be solved by not passing along penn_tok1 and penn_lem1 (including the stan stuff).
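To make the proposal concrete, here is a rough, self-contained sketch of the idea. It is not the actual LexicalPOSAlignment code: the function names `ngrams` and `sketch_alignment`, their arguments, and the returned dictionary layout are all made up for illustration. It only shows the loop starting at 1 and the syntactic (POS) side being skipped for 1-grams.

```python
def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def sketch_alignment(lex_tokens, pos_tags, maxngram=2):
    """Hypothetical stand-in for the loop discussed above.

    Collects lexical n-grams for every n from 1 to maxngram, but only
    collects syntactic (POS) n-grams for n >= 2, since POS 1-grams are
    not meaningful for syntactic alignment.
    """
    results = {}
    # Start at 1 (instead of 2) so lexical unigrams are included.
    for ngram in range(1, maxngram + 1):
        entry = {"lexical": ngrams(lex_tokens, ngram)}
        if ngram > 1:
            # Skip the POS side for 1-grams (assumed design choice).
            entry["syntactic"] = ngrams(pos_tags, ngram)
        results[ngram] = entry
    return results


if __name__ == "__main__":
    out = sketch_alignment(["the", "cat", "sat"], ["DT", "NN", "VBD"], maxngram=2)
    for n, entry in out.items():
        print(n, entry)
```

In the real function the equivalent of the `if ngram > 1` guard would presumably be whatever stops penn_tok1 and penn_lem1 (and the stan counterparts) from being passed along.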
No article in the current literature uses n-grams above 1 for lexical alignment. It'd make sense to change the default to 1.