Closed coinflip112 closed 5 months ago
Yes, theoretically it is possible, but the kdtree package currently does not allow strings. I could maybe use [u8] to represent strings, but the Levenshtein distance in this package is based on RapidFuzz's implementation, which operates on chars instead of [u8] so that it is correct even in other languages like Russian. In short, it is possible but requires way too much hacking right now...
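To illustrate the chars vs. [u8] point, here is a minimal sketch of a textbook two-row Levenshtein distance run once over chars and once over UTF-8 bytes. It is not RapidFuzz's algorithm and not this package's code; the function names `levenshtein`, `levenshtein_chars` and `levenshtein_bytes` are hypothetical and exist only for this example.

```rust
/// Levenshtein distance over any slice of comparable items (two-row DP).
/// Illustrative only: not RapidFuzz's or the package's implementation.
fn levenshtein<T: PartialEq>(a: &[T], b: &[T]) -> usize {
    // prev[j] = distance between the first i items of `a` and the first j items of `b`
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, x) in a.iter().enumerate() {
        let mut curr = Vec::with_capacity(b.len() + 1);
        curr.push(i + 1); // turning i + 1 items into an empty prefix takes i + 1 deletions
        for (j, y) in b.iter().enumerate() {
            let cost = if x == y { 0 } else { 1 };
            let best = (prev[j] + cost) // substitution (or match)
                .min(prev[j + 1] + 1)   // deletion
                .min(curr[j] + 1);      // insertion
            curr.push(best);
        }
        prev = curr;
    }
    prev[b.len()]
}

/// Distance counted in Unicode scalar values (Rust `char`s).
fn levenshtein_chars(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    levenshtein(&a, &b)
}

/// Distance counted in raw UTF-8 bytes.
fn levenshtein_bytes(a: &str, b: &str) -> usize {
    levenshtein(a.as_bytes(), b.as_bytes())
}
```

For ASCII input the two versions agree. For the Russian words "дом" and "дым", which differ by a single letter, the char version returns 1, while the byte version returns 2 because "о" and "ы" differ in both of their UTF-8 bytes.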
Makes sense. I haven't worked in Rust before, but it's on my "would like to but don't have time for" list. If I ever get the time, I could take a crack at it 😅
I am closing this issue since I won't be implementing it for now. But it is on my mind, so feel free to ping me if you find something that can facilitate the implementation. Thanks!
An inefficient implementation has been added. Please refer to the examples. I believe it is available in versions >= 0.2.3.
Nice! Thanks! Tested and works exactly as I'd expect :) 🙏
It would be extremely useful to use knn_ptwise on a string column (with string distance metrics). This should be feasible in principle, right? Levenshtein distance, for example, defines a metric space. Does KDTree allow for different distances?
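For context on the question: a k-d tree splits points along coordinate axes, so it does not apply directly to strings, which have no coordinates. Structures that rely only on the metric axioms (BK-trees or VP-trees, for example) are the usual alternative. The simplest baseline is brute force, sketched below under the assumption that k nearest neighbors means the k other strings with the smallest char-based Levenshtein distance. The names `char_levenshtein` and `knn_pointwise` are hypothetical and are not the package's API. The sketch computes every pairwise distance, so it is quadratic in the number of strings.

```rust
/// Levenshtein distance over Unicode scalar values (chars), two-row DP.
/// Hypothetical helper for this sketch only.
fn char_levenshtein(a: &str, b: &str) -> usize {
    let b: Vec<char> = b.chars().collect();
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, x) in a.chars().enumerate() {
        let mut curr = vec![i + 1];
        for (j, y) in b.iter().enumerate() {
            let substitute = prev[j] + usize::from(x != *y);
            let best = substitute.min(prev[j + 1] + 1).min(curr[j] + 1);
            curr.push(best);
        }
        prev = curr;
    }
    prev[b.len()]
}

/// For each string, return the indices of its k nearest *other* strings.
/// Brute force: every pair is compared, so this is O(n^2) distance calls.
fn knn_pointwise(items: &[&str], k: usize) -> Vec<Vec<usize>> {
    (0..items.len())
        .map(|i| {
            // (distance, index) pairs for every other string
            let mut others: Vec<(usize, usize)> = (0..items.len())
                .filter(|&j| j != i)
                .map(|j| (char_levenshtein(items[i], items[j]), j))
                .collect();
            others.sort(); // ascending by distance, ties broken by index
            others.into_iter().take(k).map(|(_, j)| j).collect()
        })
        .collect()
}
```

With the input `["apple", "appla", "banana", "bananas"]` and `k = 1`, each string's nearest neighbor is the other member of its pair, so the result is `[[1], [0], [3], [2]]`.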