brucewlee / lftk

[BEA @ ACL 2023] General-purpose tool for linguistic features extraction; Tested on readability assessment, essay scoring, fake news detection, hate speech detection, etc.
https://lftk.rtfd.io/

bug in the `total_number_of_unique_words_no_lemma` function #28

Open chaojiang06 opened 7 months ago

chaojiang06 commented 7 months ago

Hi, dear author, thank you for your awesome work! I want to bring your attention to a possible bug in the implementation of the `total_number_of_unique_words_no_lemma` function.

On lines 194-195 of `lftk/foundation/wordsent.py`, the `total_number_of_unique_words_no_lemma` function still lemmatizes the tokens, even though its name says it should not. This makes `total_number_of_unique_words_no_lemma` almost identical to `total_number_of_unique_words`.

As a result, `corr_ttr` comes out the same as `corr_ttr_no_lem`.
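To illustrate why the two counts should differ, here is a minimal, self-contained sketch. It does not use lftk or spaCy: the `TOKENS` list, `count_unique_words`, and `corrected_ttr` are hypothetical names made up for this example, and the corrected TTR formula (unique words divided by the square root of twice the total word count) is an assumption about what `corr_ttr` computes, not a copy of the library's code.

```python
import math

# Hypothetical (surface form, lemma) pairs standing in for parsed tokens.
# In the real library these would come from a spaCy Doc; here they are hard-coded.
TOKENS = [
    ("run", "run"),
    ("runs", "run"),
    ("running", "run"),
    ("dog", "dog"),
    ("dogs", "dog"),
]


def count_unique_words(tokens, use_lemma):
    """Count distinct words, keyed on the lemma or on the surface form."""
    index = 1 if use_lemma else 0
    return len({token[index].lower() for token in tokens})


def corrected_ttr(n_unique, n_total):
    """Corrected type-token ratio, assumed here to be unique / sqrt(2 * total)."""
    return n_unique / math.sqrt(2 * n_total)


n_total = len(TOKENS)
n_unique_lemma = count_unique_words(TOKENS, use_lemma=True)
n_unique_no_lemma = count_unique_words(TOKENS, use_lemma=False)

ttr_lemma = corrected_ttr(n_unique_lemma, n_total)
ttr_no_lemma = corrected_ttr(n_unique_no_lemma, n_total)

print(n_unique_lemma, n_unique_no_lemma)
print(ttr_lemma, ttr_no_lemma)
```

With inflected forms in the input, the surface-form count is larger than the lemma count, so the two ratios diverge. If the no-lemma variant also keys on the lemma, both counts collapse to the same number, which matches the behavior reported above.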

brucewlee commented 6 months ago

I'll look into this.