mothur / mothur.github.io

wiki for the mothur software package
https://mothur.github.io
Creative Commons Attribution 4.0 International

nanopore reads #126

Open krmaas opened 2 months ago

krmaas commented 2 months ago

I'm trying to run some nanopore test reads through mothur but am getting stuck at the very beginning. I was going to follow your PacBio example, so I was trying to start with fastq.info. My data are concatenated fastq.gz files as spit out by MinKNOW, but fastq.info doesn't seem to take .gz. Is there another way that I should get these reads into a format for mothur?

thanks

Kendra

pschloss commented 2 months ago

Can you try to run gunzip on the file and then use the fastq file?
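For anyone who would rather script this step than call gunzip by hand, here is a minimal Python sketch of the same decompression. The function name `gunzip_fastq` and the file names are hypothetical; the only assumption about the data is that it is standard gzip, possibly several .gz files concatenated together.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_fastq(gz_path, out_path=None):
    """Decompress a (possibly concatenated) .fastq.gz file to plain FASTQ."""
    gz_path = Path(gz_path)
    # Default output name: drop the trailing ".gz" (reads.fastq.gz -> reads.fastq)
    out_path = Path(out_path) if out_path else gz_path.with_suffix("")
    # gzip.open reads every member of a multi-member archive in sequence,
    # so a file built by concatenating several .gz files decompresses in full.
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out_path
```

The resulting plain FASTQ file is what then gets handed to fastq.info.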

krmaas commented 2 months ago

Yeah, sorry, it was that easy of a fix.

Can I ask another question here? Other than aligning seqs, is there a way to search for primers that may not be at the very beginning or end of the sequence? The quality of nanopore reads isn't great for the first ~10 bp, so I think I'm going to throw out too many seqs if I search for the full primer/adapter.

pschloss commented 2 months ago

Not exactly... I think an "easy" thing to try would be to generate a reference alignment for the positions without the primers and align your sequences to that. The parts of your sequences that overlap the missing area of the reference alignment would be dropped. That should get rid of the primers (and barcodes).

Pat

krmaas commented 2 months ago

Sorry, I'm not getting what happens to the parts of the seq that are before or after the alignment. Do those get trimmed by align.seqs?

I'm aligning the whole data set right now against the full SILVA nr. It's just taking a while because unique.seqs isn't reducing the data much: it dropped a few thousand seqs out of 7M.

krmaas commented 2 months ago

Wait, what about pcr.seqs? Going to give that a try.

pschloss commented 2 months ago

They would get trimmed off. If unique.seqs isn't reducing the data much, that's likely a sequence quality issue. You could also align and then use pre.cluster with diffs=15.

Pat

pschloss commented 2 months ago

That should work too. You can play with pdiffs to add non-specificity to account for sequencing errors.

Pat
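To make the idea of a mismatch allowance concrete, here is a rough Python sketch of searching for a primer that may start a few bases into the read, tolerating up to `pdiffs` substitutions. This is not mothur's implementation of pcr.seqs; the function names, the `window` parameter, and the primer in the usage note are all hypothetical, and indels (common in nanopore reads) are not modeled.

```python
# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def mismatches(primer, segment):
    """Count positions where the read base is not allowed by the primer code."""
    return sum(base not in IUPAC.get(code, "") for code, base in zip(primer, segment))


def find_primer(read, primer, pdiffs=2, window=50):
    """Return (start, n_mismatches) for the best primer hit that starts within
    the first `window` bases of the read, or None if no hit has at most
    `pdiffs` mismatches. Substitutions only; indels are not modeled."""
    read, primer = read.upper(), primer.upper()
    best = None
    # Only consider start positions where the whole primer fits in the read.
    last_start = min(window, len(read) - len(primer) + 1)
    for start in range(max(last_start, 0)):
        n = mismatches(primer, read[start:start + len(primer)])
        if n <= pdiffs and (best is None or n < best[1]):
            best = (start, n)
    return best
```

For example, with the made-up primer `ACGYTGGA`, `find_primer("TTTTTACGCTGCACCCCCCCC", "ACGYTGGA", pdiffs=1)` finds the primer five bases in with one mismatch, while the same call with `pdiffs=0` finds nothing. Raising the allowance keeps more reads at the cost of more spurious hits.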

krmaas commented 2 months ago

Oh, I should do that. I ran it once and it removed ~75% of the sequences (fine, whatever). But when I aligned, only a few thousand out of 1.8M needed to be reversed, which strikes me as odd. I've restarted pcr.seqs with checkorient=T, but maybe I needed to play with pdiffs.
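One way to sanity-check the orientation numbers outside mothur is to tally, for a sample of reads, whether the forward primer shows up in the read as given or only in its reverse complement. The sketch below is an illustration under simplifying assumptions, not what checkorient does internally: the function names and the `window` parameter are hypothetical, the primer is treated as plain A/C/G/T (no degenerate codes), and only substitutions are counted.

```python
from collections import Counter

# Complement table covering the IUPAC codes, so ambiguous bases survive.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def has_primer(read, primer, pdiffs=2, window=50):
    """True if the primer starts within the first `window` bases of the read
    with at most `pdiffs` substitutions."""
    last_start = min(window, len(read) - len(primer) + 1)
    for start in range(max(last_start, 0)):
        segment = read[start:start + len(primer)]
        if sum(a != b for a, b in zip(primer, segment)) <= pdiffs:
            return True
    return False


def orientation(read, fwd_primer, pdiffs=2, window=50):
    """Classify a read as 'forward', 'reverse', or 'unknown' based on where
    the forward primer is found."""
    read, fwd_primer = read.upper(), fwd_primer.upper()
    fwd = has_primer(read, fwd_primer, pdiffs, window)
    rev = has_primer(reverse_complement(read), fwd_primer, pdiffs, window)
    if fwd and not rev:
        return "forward"
    if rev and not fwd:
        return "reverse"
    return "unknown"


def tally_orientations(reads, fwd_primer, pdiffs=2, window=50):
    """Count how many reads fall into each orientation class."""
    return Counter(orientation(r, fwd_primer, pdiffs, window) for r in reads)
```

If the tally on the raw reads shows a sizeable "reverse" fraction but only a few thousand reads needed flipping after pcr.seqs, that would suggest the reverse-oriented reads were being dropped at the primer step rather than never existing.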