evolbioinfo / goalign

Goalign is a set of command line tools and an API to manipulate multiple sequence alignments. It is implemented in Go.
GNU General Public License v2.0

goalign detecting amino acid file as nucleotide file #14

Closed: frikinzi closed this issue 10 months ago

frikinzi commented 10 months ago

Thank you for developing this helpful tool.

I'm running into an issue with goalign compute where the tool is automatically detecting my file as a nucleotide file even though there are amino acid characters in it (S, Y, T). Below I include a snippet of my .faa file.

>GCA_0
SCYTGAK
>GCA_1
SAYSASK
>GCA_2
SAYSASK
>GCA_3
SAYSASK

The command I ran is: goalign compute pssm -i N85.faa > N85.count.tsv

The output of that command is shown below; I would expect it to contain amino acid characters.

[Image: Screenshot 2023-11-13 at 4 44 55 PM]

I tried including --auto-detect, but it didn't fix my problem.

I installed goalign through conda.

Any help would be appreciated. Please let me know if I can provide any other information. Thank you!

fredericlemoine commented 10 months ago

Hi, thank you for your suggestion. I've been thinking of adding this feature for a while. I've just added the option "--alphabet" to the goalign commands. It lets you specify the alphabet when you don't want the automatically detected one (detection can go wrong because some IUPAC nucleotide codes are also amino acid letters). If you have a way to test it, that would be great. Thanks.
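
To illustrate why the snippet in the report is ambiguous: every character in it (S, C, Y, T, G, A, K) is also a valid IUPAC nucleotide code (S = G or C, Y = C or T, K = G or T). The sketch below is not goalign's actual detection code; `could_be_nucleotide` and `naive_guess` are hypothetical helpers that assume a simple "all characters are IUPAC nucleotide codes" heuristic, which is enough to reproduce the misclassification.

```python
# Hypothetical illustration only; goalign's real detection logic may differ.

# IUPAC nucleotide codes: the bases, the ambiguity codes, and a gap character.
IUPAC_NUCLEOTIDE = set("ACGTURYSWKMBDHVN-")


def could_be_nucleotide(seq):
    """Return True if every character of seq is a valid IUPAC nucleotide code."""
    return set(seq.upper()) <= IUPAC_NUCLEOTIDE


def naive_guess(sequences):
    """Assumed heuristic: call the input nucleotide if no character rules it out."""
    if all(could_be_nucleotide(s) for s in sequences):
        return "nucleotide"
    return "amino acid"


# Sequences from the reported .faa snippet.
snippet = ["SCYTGAK", "SAYSASK", "SAYSASK", "SAYSASK"]
print(naive_guess(snippet))       # nucleotide

# A single residue outside the IUPAC nucleotide set (here L) removes the ambiguity.
print(naive_guess(["SAYSASKL"]))  # amino acid
```

Under this assumption, a short protein alignment with no letters outside the IUPAC nucleotide set cannot be told apart from a nucleotide alignment by its characters alone, which is why stating the alphabet explicitly is the reliable way out.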

frikinzi commented 10 months ago

I just tested it and it worked! Thanks so much for the quick response and fix.