mahulchak / mito-finder

Assembling mitochondrial genomes from long molecule sequences
GNU General Public License v3.0

Detail on running blasr #1

Open 000generic opened 4 years ago

000generic commented 4 years ago

I just wanted to highlight what I think is a slight error in the blasr command line you provide:

blasr raw_reads.fasta mito.fasta -bestn 1 -m 1 -nproc n > mito.m1

needs to read:

blasr raw_reads.fasta mito.fasta --bestn 1 -m 1 --nproc n > mito.m1

and n needs to be replaced by the number of threads to use, which might be worth mentioning.
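As a rough sketch of filling in the thread count, the snippet below detects the number of online CPU cores and passes it to the corrected command. It assumes a blasr version that takes the double-dash options shown above; the file names are the placeholders from the command line, and the fallback to a single thread is an illustrative choice, not part of the pipeline.

```shell
# Detect the number of online CPU cores; fall back to 1 if detection fails.
threads=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Only run blasr if it is installed and the placeholder input files exist;
# otherwise just print the command that would be run.
if command -v blasr >/dev/null 2>&1 && [ -f raw_reads.fasta ] && [ -f mito.fasta ]; then
    blasr raw_reads.fasta mito.fasta --bestn 1 -m 1 --nproc "$threads" > mito.m1
else
    echo "would run: blasr raw_reads.fasta mito.fasta --bestn 1 -m 1 --nproc $threads > mito.m1"
fi
```

On a shared machine it may be better to set `threads` by hand to whatever the scheduler has allocated.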

I haven't yet finished your pipeline but things are up and running - trying to assemble the pygmy squid Idiosepius paradoxus mito genome from ONT reads. Unfortunately, we had to shear the DNA to 20 kb to get PromethION to run without clogging - and the average ended up closer to 10 than 20 - but hopefully there are enough long ones in there for mito.

Thank you :)

mahulchak commented 4 years ago

You are right. What's the length of the mito genome in the squid?

On Sat, May 16, 2020 at 4:33 PM Eric Edsinger notifications@github.com wrote:

-- Mahul Chakraborty Department of Ecology and Evolutionary Biology University of California-Irvine Phone: 949 824 9559 Fax: 949 824 9559 Website: https://mahulchakraborty.wordpress.com/ Github: https://github.com/mahulchak

000generic commented 4 years ago

It's just under 16,000 bp - seems like blasr-awk worked for pulling out reads of approximately the right size. I still need to extract them and align them to themselves etc.
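The extraction step mentioned here could look roughly like the sketch below. `extract_reads` is a hypothetical helper, not part of mito-finder; it assumes a plain-text file with one read name per line and that each name matches the first whitespace-delimited token of a FASTA header exactly (blasr may append coordinates to query names in its output, in which case the names would need trimming first).

```shell
# Hypothetical helper: print the FASTA records whose header name appears
# in a list of read names (one name per line).
# Usage: extract_reads names.txt reads.fasta > selected.fasta
extract_reads() {
    names=$1
    fasta=$2
    awk '
        # First file: remember every read name we want to keep.
        FILENAME == ARGV[1] { keep[$1] = 1; next }
        # FASTA header: strip the leading ">" and decide whether to print this record.
        /^>/ { name = substr($1, 2); printing = (name in keep) }
        # Print header and sequence lines of kept records (multi-line records included).
        printing
    ' "$names" "$fasta"
}
```

For example, `extract_reads mito_reads.txt raw_reads.fasta > mito_reads.fasta`, where both file names are placeholders.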

I'm now doing blasr-awk for pygmy octopus, which also seems to be working. The species distances are greater, but I'm running it with 14 different octopus species, so hopefully some of them will work better than others.

I find that blasr pulls out the same reads, or subsets of the same reads, for the mito genomes of different octopus species, but the amount of alignment/coverage depends on the phylogenetic distance between the mito reference species and the species the reads come from, which makes sense. I know the genome will likely be 15,000-17,000 bp, but I have to lower the awk cutoff to 10,000-12,000 or less to find the same reads as phylogenetic distance increases.
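A minimal sketch of an adjustable length cutoff is given below. The exact awk command used by the pipeline is not shown in this thread, so this is an assumption: it filters on the aligned span of each read (qEnd minus qStart), taking the `-m 1` summary columns to be qName, tName, qStrand, tStrand, score, percentSimilarity, tStart, tEnd, tLength, qStart, qEnd, qLength, nCells. `select_reads` is a hypothetical helper name.

```shell
# Hypothetical helper: print the names of reads whose aligned span
# (qEnd - qStart, assumed to be columns 11 and 10 of blasr -m 1 output)
# lies within [min, max].
# Usage: select_reads MIN MAX mito.m1 > mito_reads.txt
select_reads() {
    min=$1
    max=$2
    m1=$3
    awk -v min="$min" -v max="$max" '
        {
            span = $11 - $10
            if (span >= min && span <= max) print $1
        }
    ' "$m1" | sort -u   # one line per read name
}

# Example with the ranges discussed above (mito.m1 is a placeholder):
if [ -f mito.m1 ]; then
    select_reads 15000 17000 mito.m1 > mito_reads_close.txt
    select_reads 10000 17000 mito.m1 > mito_reads_distant.txt
fi
```

Keeping the bounds as arguments makes it easy to relax the lower cutoff for more distant reference species and compare which read names are recovered.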