LIAAD / yake

Single-document unsupervised keyword extraction
https://liaad.github.io/yake
Other
1.64k stars 227 forks source link

levenshtein distance space complexity #80

Open adamgrzelak opened 1 year ago

adamgrzelak commented 1 year ago

Hello,

First of all I would like to say thank you for developing and sharing this tool with the community. I recently used it for a project and I think it works great.

I have been going through the code out of curiosity and I would like to propose an improvement (in my opinion).

Calculation of Levenshtein distance can be implemented in a way that reduces space complexity, since only the previous row of the distance matrix actually has to be stored in memory. Based on simple tests, I estimate that this implementation reduces the time required this particular task by more than half.

Source with example: https://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Levenshtein_distance#Python

Please let me know if that makes sense to you, and if yes, I will open a PR.

Kind regards, Adam Grzelak