immcantation / presto

pRESTO is part of the Immcantation analysis framework for Adaptive Immune Receptor Repertoire sequencing (AIRR-seq). pRESTO is a bioinformatics toolkit for processing high-throughput lymphocyte receptor sequencing data.
https://presto.readthedocs.io
GNU Affero General Public License v3.0

maskPrimers gapped alignment penalties #20

Closed ssnn-airr closed 9 years ago

ssnn-airr commented 9 years ago

Original report by Anonymous.


I believe a gap open should be penalized more than a substitution, though depending on the technology and the length of the primer I can see arguments against this.

An ungapped alignment option could support the extreme case of this, where gaps are never allowed.
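To make the ungapped idea concrete, here is a minimal sketch of what such an option might do: slide the primer along the read and keep the offset with the fewest mismatches. The function name `best_ungapped_hit` is hypothetical and this is not how MaskPrimers is implemented; it only illustrates an alignment that cannot introduce gaps.

```python
def best_ungapped_hit(seq, primer):
    """Return (start, mismatches) for the best gap-free placement of primer in seq.

    Hypothetical helper for illustration only; not part of pRESTO.
    Returns None when the primer is longer than the sequence.
    """
    best = None
    for start in range(len(seq) - len(primer) + 1):
        window = seq[start:start + len(primer)]
        # Substitutions are the only kind of error an ungapped alignment can report.
        mismatches = sum(1 for a, b in zip(window, primer) if a != b)
        if best is None or mismatches < best[1]:
            best = (start, mismatches)
    return best


if __name__ == "__main__":
    print(best_ungapped_hit("TTTACCTTT", "ACGT"))
```

Because the primer always occupies exactly `len(primer)` bases of the read, the downstream length assumptions hold by construction, at the cost of scoring any real indel as a run of mismatches.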

Unexpected gapping becomes an issue when amplicon structure is defined and processed according to the expected lengths of various components.

We can get around this in a number of ways, but I wanted to put the idea out there.

ssnn-airr commented 9 years ago

Original comment by Jason Vander Heiden (Bitbucket: javh, GitHub: javh).


Might just be best to add a flag for the gap penalty, which I can do. I think how common this is will depend on the platform (454 vs Illumina).

ssnn-airr commented 9 years ago

Original comment by Jason Vander Heiden (Bitbucket: javh, GitHub: javh).


Slowly getting through the issues...

I will confess this is partly motivated by the fact that I couldn't get deletions in the input sequence to translate to the ERROR rate in MaskPrimers exactly how I wanted them to... Example:

  SEQ> CGGATCTTCTACTCAAAACCGTCC-TCAGTCGTGGATCTGGTCTAGCTGGG
   PR> ---------------AATACGTCCGTCAGTCGTGGATGT------------
  ALN>                  **     -            *
ERROR> 0.208333

ERROR is 5/24 (20 character matches, -1 gap penalty), when it should probably be 4/24 (3 mismatches plus 1 gap), but fixing this leads to some nasty side effects that break otherwise correct alignments. Just FYI.
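The arithmetic can be checked directly from the aligned strings in the example. This is a rough sketch, assuming the reported error is 1 - (matches - gap penalty) / primer length; that formula is inferred from the numbers quoted here, not taken from the MaskPrimers source, and `count_columns` is a hypothetical helper.

```python
from fractions import Fraction

# Aligned strings copied from the example in this comment.
SEQ = "CGGATCTTCTACTCAAAACCGTCC-TCAGTCGTGGATCTGGTCTAGCTGGG"
PR = "---------------AATACGTCCGTCAGTCGTGGATGT------------"


def count_columns(seq_aln, primer_aln):
    """Count matches, mismatches and gap columns across the primer's span."""
    cols = [i for i, c in enumerate(primer_aln) if c != "-"]
    start, end = cols[0], cols[-1] + 1
    matches = mismatches = gaps = 0
    for s, p in zip(seq_aln[start:end], primer_aln[start:end]):
        if s == "-" or p == "-":
            gaps += 1
        elif s == p:
            matches += 1
        else:
            mismatches += 1
    return matches, mismatches, gaps


matches, mismatches, gaps = count_columns(SEQ, PR)
primer_len = sum(1 for c in PR if c != "-")

# Assumed formula: the gap column is not a match AND costs a penalty of 1,
# so it is effectively counted twice.
reported = 1 - Fraction(matches - gaps * 1, primer_len)
# Counting each differing column exactly once instead.
expected = Fraction(mismatches + gaps, primer_len)

if __name__ == "__main__":
    print(reported)  # 5/24
    print(expected)  # 1/6, i.e. 4/24
```

Under that assumption the single deletion costs two errors instead of one, which accounts for the difference between 5/24 and 4/24.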

If you want to see what effect changing the gap penalty has on the alignments, you can just change the gap_penalty=(1, 1) argument on line 132 of tests/tests_MaskPrimers.py and look at the test case output.

ssnn-airr commented 9 years ago

Original comment by Jason Vander Heiden (Bitbucket: javh, GitHub: javh).


Added a --gap argument to MaskPrimers-align to allow specification of the gap open and gap extension penalties.
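For anyone wondering how the two values interact, here is a small sketch that rescores the alignment from the earlier example under different gap open and gap extension penalties. It assumes a match scores +1, a mismatch scores 0, the first column of a gap run costs the open penalty, and each further column costs the extension penalty. Those conventions and the `affine_score` helper are assumptions for illustration, not the MaskPrimers internals.

```python
import re

# Aligned strings copied from the example earlier in this thread.
SEQ = "CGGATCTTCTACTCAAAACCGTCC-TCAGTCGTGGATCTGGTCTAGCTGGG"
PR = "---------------AATACGTCCGTCAGTCGTGGATGT------------"


def affine_score(seq_aln, primer_aln, gap_open=1.0, gap_extend=1.0):
    """Score a fixed alignment over the primer's span with affine gap penalties.

    Hypothetical helper: match = +1, mismatch = 0 (assumed scoring).
    """
    cols = [i for i, c in enumerate(primer_aln) if c != "-"]
    start, end = cols[0], cols[-1] + 1
    s, p = seq_aln[start:end], primer_aln[start:end]
    matches = sum(1 for a, b in zip(s, p) if a == b and a != "-")
    penalty = 0.0
    for row in (s, p):
        for run in re.finditer(r"-+", row):
            # Open penalty for the first gap column, extension for the rest.
            penalty += gap_open + gap_extend * (len(run.group()) - 1)
    return matches - penalty


if __name__ == "__main__":
    for gap in [(1, 1), (2, 1), (5, 1)]:
        print(gap, affine_score(SEQ, PR, gap_open=gap[0], gap_extend=gap[1]))
```

Note that this only rescores one fixed alignment. In a real run, raising the gap open penalty can also change which alignment the aligner picks, for example by trading the gap for extra mismatches.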

ssnn-airr commented 9 years ago

Original comment by Anonymous.


cool!