Martinsos / edlib

Lightweight, super fast C/C++ (& Python) library for sequence alignment using edit (Levenshtein) distance.
http://martinsos.github.io/edlib
MIT License
492 stars, 162 forks

Simple edit (Hamming) distance #186

Closed remz1337 closed 2 years ago

remz1337 commented 2 years ago

I would like to know if it is possible to compute the Hamming distance using this library?

I have 2 sequences of 30 characters each, and the characters differ at every position, so I would like the resulting distance to be 30:

UAUUGAUCAUUUGAAUCGGGUAGGCAAUAU
CCCGAUCACACAAGUCAAAACUUCUCGCGC

Not related: this library is also available on vcpkg, would be nice to mention it in the readme :)

Martinsos commented 2 years ago

I just did a quick read on Hamming distance and if I got it right, it is trivial to calculate, correct? It is just going through the chars at the same positions in both strings and counting how many of them differ?

To answer the question: no, edlib can't do this, it computes only Levenshtein distance. But as I mentioned above, Hamming distance seems to be trivial to calculate, and it can be done in a couple of lines of code with linear complexity (unless I got something wrong).
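
For illustration, here is a minimal sketch of that linear-time calculation in plain Python. It is not part of edlib; the function name `hamming_distance` is made up for this example, and it assumes both sequences have the same length (Hamming distance is not defined otherwise).

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ.

    Hypothetical helper for illustration only; edlib does not provide it.
    """
    if len(a) != len(b):
        raise ValueError("Hamming distance requires sequences of equal length")
    # One pass over both strings: O(n) time, O(1) extra memory.
    return sum(x != y for x, y in zip(a, b))


# The two sequences from the question above.
query = "UAUUGAUCAUUUGAAUCGGGUAGGCAAUAU"
target = "CCCGAUCACACAAGUCAAAACUUCUCGCGC"

print(hamming_distance(query, target))  # 30
```

Unlike Levenshtein distance, this only allows substitutions (no insertions or deletions), which is why it needs no alignment at all.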

vcpkg -> cool, I didn't know :)! I am not much in the C++ ecosystem these days, so I am somewhat out of date. Would you be interested in creating a PR that modifies the README to mention that edlib is on vcpkg, along with any other relevant information?

Martinsos commented 2 years ago

I am closing this for now but feel free to comment on it and we can reopen if needed.

remz1337 commented 2 years ago

Hey, yea no worries, it's rather easy to implement but I always prefer to leverage tools. Sure thing, give me a minute I'll send a PR for the Readme