ztane / python-Levenshtein

The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity
GNU General Public License v2.0

Compute ratios for all strings in list (feature request) #17

Open mscuthbert opened 9 years ago

mscuthbert commented 9 years ago

Hello, I'm a very poor C programmer, so unfortunately this request is a bit beyond me, but one thing that would be extremely helpful is to be able to pass in a list of sequences/strings and get back a list of lists of ratios between each pair of sequences. (Obviously, it could be done so that sequence 1 has a list of length n-1, sequence 2 has a list of length n-2, etc.; how it's done doesn't really matter.)
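To make the requested output shape concrete, here is a rough pure-Python sketch of the triangular layout described above. The name `pairwise_ratios` is hypothetical and not part of the module, and the default `stand_in_ratio` uses `difflib.SequenceMatcher` from the standard library only so the sketch runs on its own; in practice `Levenshtein.ratio` would be passed as the `ratio` argument.

```python
from difflib import SequenceMatcher


def stand_in_ratio(a, b):
    # Stand-in similarity so the sketch runs without the C extension.
    # Assumption: a real caller would pass Levenshtein.ratio instead.
    return SequenceMatcher(None, a, b).ratio()


def pairwise_ratios(sequences, ratio=stand_in_ratio):
    # Hypothetical helper: row i holds the ratios of sequence i against
    # every later sequence, so the rows have lengths n-1, n-2, ..., 0.
    seqs = list(sequences)
    return [[ratio(a, b) for b in seqs[i + 1:]] for i, a in enumerate(seqs)]


rows = pairwise_ratios(["abc", "abc", "xyz"])
```

This version still calls the ratio function once per pair, so it only illustrates the interface; the speedup would come from doing the same loop inside the C code.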

For my purposes (computing ratios among a whole slew of sequences), I've measured the speed-limiting factor to be the transfer of data to and from the C code, so I believe that transferring all the data once would give an enormous speed increase. Even transferring a list of sequences to get ratios against one string would be a big speedup, I believe.
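The simpler one-against-many variant could look roughly like this from the Python side. Again, `ratios_against` is a hypothetical name, and the `difflib`-based default is only a stand-in so the sketch is self-contained; `Levenshtein.ratio` could be passed in its place.

```python
from difflib import SequenceMatcher


def difflib_ratio(a, b):
    # Stand-in similarity; assumption: Levenshtein.ratio would be used in practice.
    return SequenceMatcher(None, a, b).ratio()


def ratios_against(target, sequences, ratio=difflib_ratio):
    # Hypothetical helper: one ratio per candidate, in input order.
    return [ratio(target, s) for s in sequences]


scores = ratios_against("abc", ["abc", "xyz", "ab"])
```

A C implementation with this signature would cross the Python/C boundary once per call instead of once per comparison.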

I'd be willing to contribute some of the other Python tasks (test suite, pure-Python fallback implementation) in return for the developers' time on this. Thanks!