Alignment algorithms compare two sequences and produce a score that reflects how similar they are.
This score usually accounts for characters that match, characters that don't, and any gaps between matching characters.
For nucleotides, scoring can be as simple as +1 for a match and -1 for a mismatch. For protein sequences, these weights can (and probably should) vary depending on the chemical similarity between groups of amino acids.
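To make the scoring concrete, here is a minimal sketch that scores two sequences that have already been aligned. The function name `score_aligned`, the `-` gap character, and the default weights (including the gap penalty of -2) are assumptions picked for illustration, not values taken from any library.

```python
def score_aligned(a, b, match=1, mismatch=-1, gap=-2):
    """Score two already-aligned, equal-length strings.

    '-' marks a gap. All weights are illustrative defaults.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    total = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            total += gap        # a gap in either sequence
        elif x == y:
            total += match      # identical characters
        else:
            total += mismatch   # differing characters
    return total
```

For protein sequences, the `match` and `mismatch` constants would be replaced by a lookup into a substitution matrix.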
I know of the BLOSUM substitution matrices, and I know that biogo ships a few of them along with Needleman-Wunsch and Smith-Waterman implementations. My thought is to cite and use at least the matrices biogo provides, and maybe implement our own alignment algorithms so that they are easier to maintain long term.
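As a rough idea of how small an in-house implementation could be, here is a sketch of Needleman-Wunsch global alignment scoring with a linear gap penalty and a pluggable substitution function. This is not biogo's API. All names here are hypothetical, `TOY_MATRIX` holds made-up numbers rather than real BLOSUM values, and the sketch returns only the score, not the traceback.

```python
def needleman_wunsch_score(a, b, sub, gap=-2):
    """Return the optimal global alignment score of a and b.

    sub(x, y) gives the substitution score for two characters;
    gap is a linear per-character gap penalty (illustrative default).
    """
    # Row 0: aligning an empty prefix of a against prefixes of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a against empty b
        for j in range(1, len(b) + 1):
            curr.append(max(
                prev[j - 1] + sub(a[i - 1], b[j - 1]),  # align the two characters
                prev[j] + gap,                          # gap in b
                curr[j - 1] + gap,                      # gap in a
            ))
        prev = curr
    return prev[-1]


def simple_sub(x, y, match=1, mismatch=-1):
    """Nucleotide-style scoring: flat match and mismatch weights."""
    return match if x == y else mismatch


# Made-up symmetric weights for three amino acids, for illustration only.
TOY_MATRIX = {
    ("I", "I"): 4, ("L", "L"): 4, ("D", "D"): 6,
    ("I", "L"): 2, ("I", "D"): -3, ("L", "D"): -4,
}


def matrix_sub(x, y):
    """Protein-style scoring: look the pair up in either order."""
    if (x, y) in TOY_MATRIX:
        return TOY_MATRIX[(x, y)]
    return TOY_MATRIX[(y, x)]
```

Calling `needleman_wunsch_score("GATTACA", "GATACA", simple_sub)` uses the flat nucleotide weights, while passing `matrix_sub` switches to matrix-based scoring without touching the algorithm. Keeping the matrix behind a function like this would let us reuse published matrices and still own the alignment code.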