Open Aciole-David opened 2 weeks ago
Hi David,
Historically, the Pearson FASTA format used an optional asterisk as an end marker on each sequence record, like:
> my-sequence
ATTAAAGGTTTATACCTTCCCAGGTAACAAACC
GGTTTATACCTTCCCA*
The last asterisk is not part of the sequence itself and reading the sequence stops at that point. I can't remove this behaviour without breaking older inputs. However, I can skip over internal asterisks as in your example.
The branch https://github.com/desmid/mview/tree/issue-23-handle-asterisks-in-pearson-format makes the change, but note that if one of your sequences ever actually ends with an asterisk, it will be treated as the end-of-sequence marker and stripped.
Nigel
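For readers following along, the behaviour described in the reply above can be sketched roughly as follows. This is only an illustrative Python sketch, not MView's actual code: the function name `parse_pearson` and the `keep_internal` flag are hypothetical, and it assumes that "skipping over" internal asterisks means they no longer end the read, with only one trailing asterisk per record treated as the end marker.

```python
def parse_pearson(text, keep_internal=True):
    """Parse Pearson/FASTA text into a list of (header, sequence) pairs.

    Hypothetical sketch, not MView's implementation. Assumptions:
    - one trailing '*' on a record is an end marker and is stripped;
    - internal '*' characters do not end the record; they are kept
      unless keep_internal is False, in which case they are dropped.
    """
    records = []
    header, chunks = None, []

    def flush():
        # Emit the record collected so far, if any.
        if header is None:
            return
        seq = "".join(chunks)
        if seq.endswith("*"):
            seq = seq[:-1]  # strip the optional end-of-sequence marker
        if not keep_internal:
            seq = seq.replace("*", "")  # drop remaining internal asterisks
        records.append((header, seq))

    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            flush()  # finish the previous record before starting a new one
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    flush()
    return records
```

As the reply points out, a sequence that genuinely ends with an asterisk cannot be told apart from one carrying the end marker, so this sketch strips that final character as well.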
Hello! I'd like to kindly ask whether it is possible to add support for inputs bearing the asterisk "*" character. The one I tested was made with sam2fasta.py:
raw
$ cat minitestsam2fastapy.fasta
raw fails
$ mview -in fasta minitestsam2fastapy.fasta
edited
$ cat minitestsam2fastapy-EDIT.fasta
edited works
$ mview -in fasta minitestsam2fastapy-EDIT.fasta