ekg / seqwish

alignment to variation graph inducer
MIT License

Clarify behaviour with softmasked sequence #55

Closed: glennhickey closed this issue 4 years ago

glennhickey commented 4 years ago

seqwish seems to ignore alignments in the input PAF that contain softmasked/lowercase sequence in the fasta.

I humbly suggest that this be turned off by default and activated by a command-line option, like the other filters.

Or, if it stays the default, a mention of how it works in the README would be very helpful.

ekg commented 4 years ago

Yes, sorry about this. It would be an easy thing to patch by uppercasing the input sequences. It has even caught me out before. I was typically preprocessing the FASTA inputs to work around this, but that is tedious.
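For anyone hitting this before a fix lands, the preprocessing workaround can be sketched roughly as follows. This is only an illustration of uppercasing a FASTA before handing it to seqwish, not code from seqwish itself; the function name `uppercase_fasta` and the demo record are made up for the example.

```python
import io

def uppercase_fasta(src, dst):
    """Copy FASTA text from src to dst, uppercasing sequence lines only.

    Hypothetical helper for illustration; not part of seqwish.
    """
    for line in src:
        if line.startswith(">"):
            # Header lines are copied unchanged so sequence names still
            # match the names used in the PAF.
            dst.write(line)
        else:
            # Softmasked (lowercase) bases become uppercase.
            dst.write(line.upper())

# Small in-memory demo; real use would pass open file handles instead.
masked = io.StringIO(">chr1 demo\nACGTacgtNNnn\nacgt\n")
unmasked = io.StringIO()
uppercase_fasta(masked, unmasked)
print(unmasked.getvalue(), end="")
```

Only the sequence lines are touched, so the alignments in the PAF should not need to be regenerated, assuming the aligner itself did not depend on the masking.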

(Sent by email in reply to Glenn Hickey's comment of Thu, Jun 25, 2020, 19:58.)

ekg commented 4 years ago

This should fix the usability issue.

Another thing is the rewriting of sequence names with the group prefix. That could just be done on the output graph, rather than on the input FASTAs.
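As a rough sketch of what renaming on the output side could look like, the snippet below prepends a prefix to path names in GFA text. It assumes the output graph is GFA1, where `P` lines carry the path name in the second tab-separated field; the function name `prefix_gfa_paths` and the `sampleA#` prefix are hypothetical, and this is not how seqwish does or would implement it.

```python
def prefix_gfa_paths(lines, prefix):
    """Return GFA lines with `prefix` prepended to every path (P line) name.

    Hypothetical helper for illustration; assumes GFA1 P lines of the form
    P <name> <segments> <overlaps>, separated by tabs.
    """
    out = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "P" and len(fields) > 1:
            # Field 1 is the path name; segment (S) and link (L) lines
            # are left untouched.
            fields[1] = prefix + fields[1]
        out.append("\t".join(fields))
    return out

# Tiny made-up graph with one segment and one path.
gfa = ["H\tVN:Z:1.0\n", "S\t1\tACGT\n", "P\tchr1\t1+\t*\n"]
renamed = prefix_gfa_paths(gfa, "sampleA#")
print("\n".join(renamed))
```

Doing this on the graph would leave the input FASTAs and the PAF alone, at the cost of a post-processing pass over the output.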