Open DaniDelHoyo opened 4 months ago
SPAAN is causing the issue. In lib/spaan/SPAAN/filter.c, you can see a comment indicating that proteins shorter than 50 amino acids are filtered out. Additionally, non-canonical amino acids are incompatible with PSORT.
My suggestion: write a script to sanitize your input FASTA file by removing sequences shorter than 50 amino acids and any sequences containing non-canonical amino acids.
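A minimal sketch of such a sanitizing script is below. It is not part of VaxignML or SPAAN. The function names (`read_fasta`, `is_clean`, `sanitize`, `sanitize_file`) are made up for illustration, and two things are assumptions on my side: that "canonical" means the 20 standard one-letter amino acid codes, and that a sequence of exactly 50 residues is kept.

```python
import sys

# Assumption: "canonical" means the 20 standard amino acid one-letter codes.
CANONICAL = set("ACDEFGHIKLMNPQRSTVWY")
# Assumption: sequences of exactly 50 residues are kept, shorter ones dropped.
MIN_LENGTH = 50


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def is_clean(seq, min_length=MIN_LENGTH):
    """True if seq is long enough and uses only canonical residues."""
    seq = seq.upper()
    return len(seq) >= min_length and set(seq) <= CANONICAL


def sanitize(lines, min_length=MIN_LENGTH):
    """Split FASTA records into (kept, dropped) lists of (header, sequence)."""
    kept, dropped = [], []
    for header, seq in read_fasta(lines):
        target = kept if is_clean(seq, min_length) else dropped
        target.append((header, seq))
    return kept, dropped


def sanitize_file(in_path, out_path):
    """Write the kept records to out_path and report dropped ones on stderr."""
    with open(in_path) as fh:
        kept, dropped = sanitize(fh)
    with open(out_path, "w") as out:
        for header, seq in kept:
            out.write(f">{header}\n{seq}\n")
    for header, _ in dropped:
        print(f"dropped: {header}", file=sys.stderr)
```

Calling `sanitize_file("input.fasta", "clean.fasta")` (both file names are placeholders) writes the surviving records with one sequence line each and lists the dropped headers on stderr, so you can check which proteins were removed before running the pipeline.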
I have been getting errors in the VaxignML output when some sequences are filtered out, and I have noticed two bugs:
It seems that unrecognized residues are mistaken for a new sequence name at some point?