Open hyphaltip opened 5 months ago
https://github.com/lh3/minimap2/tree/master/python It looks like the minimap2 team also developed a Python binding (mappy). I haven't used it, but it is worth trying to see whether it can return the results as Python objects that can be manipulated and filtered directly in Python. It can be installed from PyPI or conda, which would be great for dependency management.
PyPI: https://pypi.org/project/mappy/ Conda: https://anaconda.org/bioconda/mappy
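To make the mappy idea concrete, here is a minimal sketch of screening reads against a TE library from Python. `mp.Aligner`, `mp.fastx_read`, `Aligner.map` and the hit attribute `mlen` are part of mappy; the function names, the `min_match` threshold, and the file names are hypothetical and only there for illustration.

```python
from types import SimpleNamespace


def passes_filter(hit, min_match=30):
    """Keep a hit if it has at least min_match matching bases (hit.mlen)."""
    return hit.mlen >= min_match


def screen_reads(te_fasta, reads_fastq, min_match=30):
    """Yield (read_name, hit) for every read that aligns to the TE library.

    Hypothetical helper: each mate file is screened on its own, as single reads.
    """
    import mappy as mp  # imported lazily so passes_filter works without mappy

    aligner = mp.Aligner(te_fasta, preset="sr")  # short-read preset
    if not aligner:
        raise RuntimeError(f"failed to load or build index for {te_fasta}")
    for name, seq, _qual in mp.fastx_read(reads_fastq):
        for hit in aligner.map(seq):
            if passes_filter(hit, min_match):
                yield name, hit


# Quick check of the filter with a stand-in object instead of a real hit.
example_hit = SimpleNamespace(ctg="TE1", mlen=50)
keep_example = passes_filter(example_hit)
```

Something like `for name, hit in screen_reads("te_library.fa", "reads_1.fq"): print(name, hit.ctg, hit.r_st, hit.r_en)` would then give the TE name and coordinates per read without writing an intermediate file.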
When using minimap2 to screen reads for TEs, we may not really want to run the data in paired-end mode.
--sr | Enable short-read alignment heuristics. In the short-read mode, minimap2 applies a second round of chaining with a higher minimizer occurrence threshold if no good chain is found. In addition, minimap2 attempts to patch gaps between seeds with ungapped alignment.
--for-only | Only map to the forward strand of the reference sequences. For paired-end reads in the forward-reverse orientation, the first read is mapped to forward strand of the reference and the second read to the reverse strand.
Consider that these minimap2 options for paired-end data assume we want the second read mapped to the reverse strand, while I think we may just want to screen the reads without treating them as pairs.
Okay, the current code can generate BAM files of reads aligned to the TE library via minimap2, and I am running the left and right read files separately. Next we can work on implementing the algorithm for the TE-containing regions of the reads.
Develop a minimap2 alignment [see docs] function that uses minimap2 (installed via conda or found in the PATH) and produces a BAM file. The input options are probably
This should be part of the development of the aligner.py library, unless we change the name; see #1
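A rough sketch of what such an alignment function could look like, run once per read file. The minimap2 flags `-a`, `-x sr` and `-t` are real; using `samtools sort` to produce the BAM is an assumption (the thread does not say how the BAM is written), and all function and parameter names here are hypothetical.

```python
import shutil
import subprocess


def minimap2_cmd(te_library, reads, threads=1):
    """Build the minimap2 command: SAM output (-a) with the short-read preset."""
    return ["minimap2", "-a", "-x", "sr", "-t", str(threads), te_library, reads]


def samtools_sort_cmd(out_bam):
    """Build the samtools command that sorts SAM from stdin into a BAM file."""
    return ["samtools", "sort", "-o", out_bam, "-"]


def align_to_bam(te_library, reads, out_bam, threads=1):
    """Align one read file to the TE library and write a sorted BAM.

    Assumes minimap2 and samtools are both available in the PATH.
    """
    for tool in ("minimap2", "samtools"):
        if shutil.which(tool) is None:
            raise FileNotFoundError(f"{tool} not found in PATH")
    mm2 = subprocess.Popen(
        minimap2_cmd(te_library, reads, threads), stdout=subprocess.PIPE
    )
    try:
        # Stream minimap2's SAM output straight into samtools sort.
        subprocess.run(samtools_sort_cmd(out_bam), stdin=mm2.stdout, check=True)
    finally:
        mm2.stdout.close()
        returncode = mm2.wait()
    if returncode != 0:
        raise subprocess.CalledProcessError(returncode, "minimap2")
    return out_bam
```

Running the left and right files separately would then be two calls, e.g. `align_to_bam("te_library.fa", "reads_1.fq", "reads_1.bam")` and the same for `reads_2.fq`, which sidesteps the paired-end strand assumptions entirely.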