nickduran / align-linguistic-alignment

Python library for extracting quantitative, reproducible metrics of multi-level alignment between two speakers in naturalistic language corpora.
MIT License

Change lexical alignment default to 1-grams #26

Open fusaroli opened 6 years ago

fusaroli commented 6 years ago

No article in the current literature uses n-grams above 1 for lexical alignment. It'd make sense to change the default to 1.

fusaroli commented 6 years ago

The fix seems pretty straightforward: in calculate_alignment.py, function LexicalPOSAlignment, the line

for ngram in range(2,maxngram+1):

should become

for ngram in range(1,maxngram+1):
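To make the effect of that one-character change concrete, here is a minimal standalone sketch. It is not the library's code: `ngrams`, `sizes`, and the sample tokens are hypothetical names made up for illustration, and only the two `range(...)` expressions come from the lines above.

```python
def ngrams(tokens, n):
    """Return all contiguous n-grams of `tokens` as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def sizes(start, maxngram):
    """n-gram sizes the loop would visit, given its starting value."""
    return list(range(start, maxngram + 1))


# Hypothetical utterance, already tokenized.
tokens = ["i", "like", "green", "tea"]
maxngram = 2

# Current behaviour: the loop starts at 2, so 1-grams are never produced.
old = {n: ngrams(tokens, n) for n in sizes(2, maxngram)}

# Proposed behaviour: the loop starts at 1, so 1-grams are included.
new = {n: ngrams(tokens, n) for n in sizes(1, maxngram)}

print(sorted(old))  # n-gram sizes produced before the change
print(sorted(new))  # n-gram sizes produced after the change
```

With `maxngram = 2`, the first mapping only has bigrams, while the second also has the single-word sequences that lexical alignment is usually computed on.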

This, however, has the potential drawback of giving the user an additional, meaningless syntactic alignment score for 1-grams. No biggie for me, but if we want to avoid that, it could be solved by not passing along penn_tok1 and penn_lem1 (and likewise for the stan stuff).
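One way to picture that workaround is to let the starting n-gram size depend on the alignment level. This is only a sketch under assumptions: `ngram_sizes` and the `"lexical"` / `"syntactic"` labels are hypothetical, and the actual function may organize its inputs quite differently.

```python
def ngram_sizes(level, maxngram):
    """Sizes to compute for a given alignment level (hypothetical helper).

    Lexical alignment starts at 1-grams; syntactic (POS) alignment keeps
    starting at 2-grams, so no 1-gram POS scores are produced.
    """
    start = 1 if level == "lexical" else 2
    return list(range(start, maxngram + 1))


maxngram = 3
lexical_sizes = ngram_sizes("lexical", maxngram)
syntactic_sizes = ngram_sizes("syntactic", maxngram)

print(lexical_sizes)
print(syntactic_sizes)
```

The design choice here is to filter at the point where sizes are chosen rather than dropping columns afterwards; either approach would keep the 1-gram syntactic scores out of the output.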