immcantation / presto

pRESTO is part of the Immcantation analysis framework for Adaptive Immune Receptor Repertoire sequencing (AIRR-seq). pRESTO is a bioinformatics toolkit for processing high-throughput lymphocyte receptor sequencing data.
https://presto.readthedocs.io
GNU Affero General Public License v3.0

Replace usearch with an open source tool in AssemblePairs-reference #45

Closed ssnn-airr closed 7 years ago

ssnn-airr commented 7 years ago

Original report by Jason Vander Heiden (Bitbucket: javh, GitHub: javh).


vsearch doesn't provide local alignment methods, so we cannot replace usearch with vsearch in AssemblePairs-reference. We need an alternative.

Binaries for BWA and swipe are Linux only, which unfortunately rules them out. Perhaps the SSW implementation in scikit-bio?
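For readers unfamiliar with the requirement, the following is a minimal pure-Python sketch of local (Smith-Waterman) alignment scoring, the kind of method the replacement tool needs to provide. It is illustrative only and is not pRESTO code: the function name and the scoring values (`match`, `mismatch`, `gap`) are assumptions, and a production tool would use an optimized implementation such as SSW.

```python
def smith_waterman_score(query, target, match=2, mismatch=-3, gap=-5):
    """Return the best local alignment score using a linear gap penalty.

    Scoring values are illustrative assumptions, not pRESTO defaults.
    """
    previous = [0] * (len(target) + 1)
    best = 0
    for q in query:
        current = [0]
        for j, t in enumerate(target, start=1):
            diagonal = previous[j - 1] + (match if q == t else mismatch)
            up = previous[j] + gap
            left = current[j - 1] + gap
            # Flooring at zero is what makes the alignment local:
            # a poor prefix never penalizes a good internal match.
            cell = max(0, diagonal, up, left)
            current.append(cell)
            best = max(best, cell)
        previous = current
    return best
```

Because every cell is floored at zero, a short query embedded in a longer reference scores the same as the query aligned to itself, which is the behavior a global-only aligner cannot reproduce.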

ssnn-airr commented 7 years ago

Original comment by Jason Vander Heiden (Bitbucket: javh, GitHub: javh).


Other options:

  1. parasail python bindings
  2. blastn, which is slower than usearch, but is already implemented.

ssnn-airr commented 7 years ago

Original comment by Jason Vander Heiden (Bitbucket: javh, GitHub: javh).


Added blastn support in 9c90566. usearch and blastn do not yield the same results, apparently because blastn is stricter about which hits it returns; the cause is not yet known.

ssnn-airr commented 7 years ago

Original comment by Jason Vander Heiden (Bitbucket: javh, GitHub: javh).


Syncing the word size and masking between ublast and blastn seems to have resolved most of the differences. Still some testing to do, but I'm going to go with the blastn solution for now.
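As a rough sketch of what pinning these settings on the blastn side could look like, the helper below builds a command line with an explicit word size and with DUST low-complexity masking disabled. The function name, the file names, and the specific values (`word_size=11`, `dust="no"`, `evalue=1e-5`) are assumptions for illustration and are not the values used in the commits referenced in this thread; only the `blastn` flags themselves are standard BLAST+ options.

```python
def build_blastn_command(query_fasta, database, word_size=11, dust="no", evalue=1e-5):
    """Build a blastn argument list with explicit word size and masking.

    Default values are hypothetical placeholders, not pRESTO settings.
    """
    return [
        "blastn",
        "-query", query_fasta,
        "-db", database,
        "-word_size", str(word_size),  # seed length for initial matches
        "-dust", dust,                 # "no" disables low-complexity masking
        "-evalue", str(evalue),
        "-outfmt", "6",                # tabular output, one hit per line
    ]


# Hypothetical file names; pass the list to subprocess.run() to execute.
command = build_blastn_command("reads.fasta", "reference_db")
```

Setting these options explicitly, rather than relying on each tool's defaults, makes it easier to attribute any remaining differences to the aligners themselves.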

Dockerfiles have also been updated with blast+ installs.

ssnn-airr commented 7 years ago

Original comment by Jason Vander Heiden (Bitbucket: javh, GitHub: javh).


Seems to work fine as of 5102751. There are minor differences between the usearch and blastn alignments, but these may be impractical to resolve (1 indel discrepancy out of 100 alignments).
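One way such a comparison could be scripted is sketched below. It assumes both tools were asked to write the standard 12-column tabular format (query, subject, identity, length, mismatches, gap opens, query start/end, subject start/end, e-value, bit score); the function names and the example rows are made up for illustration and do not come from the actual test data.

```python
import csv
import io


def best_hits(handle):
    """Map each query ID to its first-listed hit.

    The stored tuple is (subject, gap opens, qstart, qend, sstart, send).
    """
    hits = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        if row[0] not in hits:
            hits[row[0]] = (row[1], row[5], row[6], row[7], row[8], row[9])
    return hits


def discordant_queries(hits_a, hits_b):
    """Return queries present in both result sets whose hits differ."""
    shared = hits_a.keys() & hits_b.keys()
    return sorted(q for q in shared if hits_a[q] != hits_b[q])


# Made-up example rows: read2 differs by one gap opening and its end position.
usearch_out = io.StringIO(
    "read1\tIGHV1\t98.0\t100\t2\t0\t1\t100\t5\t104\t1e-40\t180\n"
    "read2\tIGHV3\t97.0\t100\t3\t0\t1\t100\t9\t108\t1e-38\t172\n"
)
blastn_out = io.StringIO(
    "read1\tIGHV1\t98.0\t100\t2\t0\t1\t100\t5\t104\t1e-40\t180\n"
    "read2\tIGHV3\t97.0\t100\t2\t1\t1\t100\t9\t107\t1e-38\t170\n"
)
discordant = discordant_queries(best_hits(usearch_out), best_hits(blastn_out))
```

Comparing only the subject, gap count, and coordinates sidesteps score and e-value differences, which are expected between tools and are not in themselves a sign of a different alignment.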