Could you post the summary.seqs output for the fasta and count file before you run the screen.seqs command?
mothur > summary.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table)
make.contigs(file=stability.files, processors=8)
summary.seqs(fasta=stability.trim.contigs.fasta)
Using 8 processors.
             Start  End  NBases  Ambigs  Polymer  NumSeqs
Minimum:     1      35   35      0       2        1
2.5%-tile:   1      434  434     0       4        203069
25%-tile:    1      441  441     6       5        2030689
Median:      1      456  456     16      6        4061378
75%-tile:    1      460  460     23      7        6092066
97.5%-tile:  1      573  573     35      11       7919686
Maximum:     1      602  602     61      300      8122754
Mean:        1      456  456     15      6
screen.seqs(fasta=stability.trim.contigs.fasta, group=stability.contigs.groups, summary=stability.trim.contigs.summary, maxambig=0, maxlength=275)
unique.seqs(fasta=stability.trim.contigs.good.fasta)
Unable to open stability.trim.contigs.good.fasta. Trying default /Users/claudiaperdomo/Desktop/cabras/stability.trim.contigs.good.fasta.
1000 710
1273 864
summary.seqs(count=stability.trim.contigs.good.count_table)
             Start  End  NBases  Ambigs  Polymer  NumSeqs
Minimum:     1      35   35      0       2        1
2.5%-tile:   1      51   51      0       3        32
25%-tile:    1      82   82      0       4        319
Median:      1      96   96      0       4        637
75%-tile:    1      164  164     0       6        955
97.5%-tile:  1      227  227     0       6        1242
Maximum:     1      274  274     0       9        1273
Mean:        1      119  119     0       4
total # of seqs: 1273
It took 0 secs to summarize 1273 sequences.
align.seqs(fasta=ecoli.16srrna.pcr.fasta, reference=silva.bacteria.fasta)
It took 17 to read 14956 sequences.
pcr.seqs(fasta=silva.bacteria.fasta, start=6428, end=23444, keepdots=, processors=8)
rename.file(input=silva.bacteria.pcr.fasta, new=silva.v4.fasta)
summary.seqs(fasta=silva.v4.fasta)
             Start  End    NBases  Ambigs  Polymer  NumSeqs
Minimum:     1      16155  383     0       3        1
2.5%-tile:   2      17016  403     0       4        374
25%-tile:    2      17016  406     0       4        3740
Median:      2      17016  425     0       5        7479
75%-tile:    2      17016  428     0       5        11218
97.5%-tile:  2      17016  429     1       6        14583
Maximum:     8      17016  462     5       9        14956
Mean:        2      17015  418     0       4
align.seqs(fasta=stability.trim.contigs.good.unique.fasta, reference=silva.v4.fasta)
[WARNING]: 203 of your sequences generated alignments that eliminated too many bases, a list is provided in /Users/claudiaperdomo/Desktop/cabras/stability.trim.contigs.good.unique.flip.accnos.
[NOTE]: 91 of your sequences were reversed to produce a better alignment.
summary.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table)
             Start  End    NBases  Ambigs  Polymer  NumSeqs
Minimum:     0      0      0       0       1        1
2.5%-tile:   2      1273   7       0       2        32
25%-tile:    9932   17016  51      0       3        319
Median:      15618  17016  66      0       4        637
75%-tile:    15657  17016  130     0       4        955
97.5%-tile:  16155  17016  182     0       6        1242
Maximum:     17016  17016  255     0       7        1273
Mean:        13639  16304  86      0       4
total # of seqs: 1273
It took 0 secs to summarize 1273 sequences.
screen.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table, summary=stability.trim.contigs.good.unique.summary, start=1800, end=11990, maxhomop=8)
Running command: remove.seqs(accnos=/Users/claudiaperdomo/Desktop/cabras/stability.trim.contigs.good.unique.bad.accnos.temp, count=/Users/claudiaperdomo/Desktop/cabras/stability.trim.contigs.good.count_table)
Removing group: L10B because all sequences have been removed.
summary.seqs(fasta=current, count=current)
Using 8 processors.
[ERROR]: /Users/claudiaperdomo/Desktop/cabras/stability.trim.contigs.good.good.count_table is blank. Please correct.
[ERROR]: /Users/claudiaperdomo/Desktop/cabras/stability.trim.contigs.good.good.count_table is blank. Please correct.
[ERROR]: /Users/claudiaperdomo/Desktop/cabras/stability.trim.contigs.good.unique.good.align is blank. Please correct.
Error in reading your fastafile, at position -1. Blank name.
mothur > summary.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table)
             Start  End    NBases  Ambigs  Polymer  NumSeqs
Minimum:     0      0      0       0       1        1
2.5%-tile:   2      1273   7       0       2        32
25%-tile:    9932   17016  51      0       3        319
Median:      15618  17016  66      0       4        637
75%-tile:    15657  17016  130     0       4        955
97.5%-tile:  16155  17016  182     0       6        1242
Maximum:     17016  17016  255     0       7        1273
Mean:        13639  16304  86      0       4
mothur > screen.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table, summary=stability.trim.contigs.good.unique.summary, start=1800, end=11990, maxhomop=8)
The start and end values are causing all your sequences to be scrapped. https://mothur.org/wiki/screen.seqs/#start--end. The start parameter tells mothur to remove all sequences that start after 1800. The end parameter tells mothur to remove all sequences that end before 11990. Together any sequence that starts after 1800 or ends before 11990 will be removed.
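To make the rule above concrete, here is a minimal Python sketch of the start/end test as described in this reply. It is not mothur's code: the function name `passes_screen` is made up for illustration, and the (Start, End) pairs are the quantile rows from the summary.seqs output in this thread, treated as if each row were a single sequence (summary.seqs reports each column's quantiles separately, so this is only an approximation).

```python
def passes_screen(seq_start, seq_end, start=None, end=None):
    """Return True if an aligned sequence would survive the start/end screen.

    Illustrative only: a sequence is dropped when it starts after `start`
    or ends before `end`, as described in the reply above.
    """
    if start is not None and seq_start > start:
        return False
    if end is not None and seq_end < end:
        return False
    return True


# (Start, End) values copied from the summary.seqs quantile rows in this thread.
summary_rows = {
    "2.5%-tile": (2, 1273),
    "25%-tile": (9932, 17016),
    "Median": (15618, 17016),
    "75%-tile": (15657, 17016),
    "97.5%-tile": (16155, 17016),
}

results = {
    label: passes_screen(s, e, start=1800, end=11990)
    for label, (s, e) in summary_rows.items()
}

for label, kept in results.items():
    print(f"{label}: {'kept' if kept else 'removed'}")
```

With start=1800 and end=11990, every row fails at least one of the two conditions, which is consistent with the blank count_table and alignment files reported above.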
Hi, now I have this problem:
summary.seqs(fasta=current, count=current)
Using stability.trim.contigs.good.unique.good.filter.unique.precluster.count_table as input file for the count parameter.
Using stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.fasta as input file for the fasta parameter.
Using 8 processors.
             Start  End  NBases  Ambigs  Polymer  NumSeqs
Minimum:     1      10   10      0       1        1
2.5%-tile:   1      10   10      0       2        28
25%-tile:    1      10   10      0       2        274
Median:      1      10   10      0       3        548
75%-tile:    1      10   10      0       3        822
97.5%-tile:  1      10   10      0       3        1068
Maximum:     1      10   10      0       6        1095
Mean:        1      10   10      0       2
total # of seqs: 1095
It took 0 secs to summarize 1095 sequences.
Output File Names: stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.summary
mothur > classify.seqs(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.fasta, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.count_table, reference=trainset9_032012.pds.fasta, taxonomy=trainset9_032012.pds.tax, cutoff=80)
[ERROR]: stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.count_table is blank. Please correct.
[ERROR]: M02738_15_000000000-CCG6C_1_1104_23669_2431 is not in your count table. Please correct.
All your reads are only 10 bases long. Something went wrong before this point.
What parameters did you choose for the screen.seqs command above?
screen.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table, summary=stability.trim.contigs.good.unique.summary, start=16155, end=17016, maxhomop=8)
summary.seqs(fasta=current, count=current)
             Start  End    NBases  Ambigs  Polymer  NumSeqs
Minimum:     7431   17016  10      0       1        1
2.5%-tile:   9520   17016  12      0       3        28
25%-tile:    14856  17016  63      0       4        274
Median:      15618  17016  66      0       4        548
75%-tile:    15625  17016  130     0       6        822
97.5%-tile:  16147  17016  182     0       6        1068
Maximum:     16152  17016  255     0       7        1095
Mean:        14084  17016  95      0       4
total # of seqs: 1095
filter.seqs(fasta=stability.trim.contigs.good.unique.good.align, vertical=T, trump=.)
unique.seqs(fasta=stability.trim.contigs.good.unique.good.filter.fasta, count=stability.trim.contigs.good.good.count_table)
691 60
pre.cluster(fasta=stability.trim.contigs.good.unique.good.filter.unique.fasta, count=stability.trim.contigs.good.unique.good.filter.count_table, diffs=2)
remove.seqs(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta, accnos=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.accnos)
summary.seqs(fasta=current, count=current)
             Start  End  NBases  Ambigs  Polymer  NumSeqs
Minimum:     1      10   10      0       1        1
2.5%-tile:   1      10   10      0       2        28
25%-tile:    1      10   10      0       2        274
Median:      1      10   10      0       3        548
75%-tile:    1      10   10      0       3        822
97.5%-tile:  1      10   10      0       3        1068
Maximum:     1      10   10      0       6        1095
Mean:        1      10   10      0       2
total # of seqs: 1095
The make.contigs command assembles over 8 million reads.
All but ~1200 reads are removed by the first screen.seqs command due to the maxlength parameter. What region are you sequencing?
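As a rough check of that point, the short Python sketch below compares maxlength=275 against the contig length quantiles reported by summary.seqs right after make.contigs. All numbers are copied from the output in this thread, and the variable names are only for illustration. Note that maxambig=0 was set in the same screen.seqs call, so some of the loss may come from that parameter as well.

```python
# NBases quantiles from summary.seqs after make.contigs (copied from this thread).
length_quantiles = {
    "2.5%-tile": 434,
    "25%-tile": 441,
    "Median": 456,
    "75%-tile": 460,
    "97.5%-tile": 573,
}

maxlength = 275          # value passed to the first screen.seqs command
total_contigs = 8122754  # NumSeqs reported after make.contigs
kept_contigs = 1273      # "total # of seqs" reported after screen.seqs

# Quantile rows whose contig length already exceeds maxlength.
too_long = [label for label, nbases in length_quantiles.items() if nbases > maxlength]
kept_fraction = kept_contigs / total_contigs

print("quantiles longer than maxlength:", too_long)
print(f"kept {kept_contigs} of {total_contigs} contigs ({kept_fraction:.4%})")
```

Even the 2.5%-tile contig (434 bases) is longer than 275, so nearly every contig is expected to fail the maxlength test with these settings.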
The second screen.seqs command is not creating good overlap, which is why the filter.seqs command removes so many alignment columns.
https://mothur.org/wiki/filter.seqs/
The trump option in the filter.seqs command will remove a column if the trump character is found at that position in any sequence of the alignment. You can use any character with the trump setting (‘.’, ‘-‘, ‘N’, etc). NOTE: having one or two sequences included that don’t align with the bulk of your sequences may lead to all columns being removed by the trump option!
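A toy Python sketch of the trump behaviour described in that note is shown below. It models only the trump rule (drop a column if any sequence has the trump character there), not the vertical option, and it is not mothur's implementation. The function name `trump_columns` and the short alignments are invented for illustration.

```python
def trump_columns(alignment, trump="."):
    """Return indices of columns that survive the trump rule.

    A column is dropped as soon as any sequence has the trump
    character at that position.
    """
    length = len(alignment[0])
    return [
        i for i in range(length)
        if all(seq[i] != trump for seq in alignment)
    ]


# Three made-up sequences that cover the same region of the alignment.
well_aligned = [
    "..ACGT-ACG..",
    "..ACGTTACG..",
    "..AC-TTACG..",
]

# The same alignment plus one sequence that aligned to a different region.
misaligned = well_aligned + ["..........AC"]

kept_good = trump_columns(well_aligned)
kept_bad = trump_columns(misaligned)

print("columns kept without the stray sequence:", kept_good)
print("columns kept with the stray sequence:", kept_bad)
```

In this toy case a single stray sequence is enough to leave no shared columns, which is the same kind of collapse as reads ending up only 10 bases long after filter.seqs.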
1. Is over 8 million reads good or bad?
...Thanks for your help, I'm really new to mothur.
Is 8 million reads a "good" or "bad" number of reads?
The number of reads assembled by make.contigs is determined by the number of reads in the input files and the parameters you set. You have ~8 million reads in your r1 / r2 files. Instead of judging the number of reads as good or bad, we try to focus on the quality of the reads produced in the analysis. This paper, https://www.ncbi.nlm.nih.gov/pubmed/22194782, explains our error reduction approach.
How do I determine the maxlength and maxambig parameters? Why is good overlap important?
The maxlength is set to 275 in the MiSeq SOP because of the region we are sequencing. Pat talks about the importance of the region and good overlap in this blog post: https://mothur.org/blog/2014/Why-such-a-large-distance-matrix/. Without good overlap, the error rate and the number of spurious OTUs increase.
Some other links that new users may find helpful:
https://mothur.org
https://mothur.org/wiki/frequently_asked_questions/
https://forum.mothur.org
https://mothur.org/wiki/mothur_manual/
https://mothur.org/wiki/miseq_sop/ - Pat’s example analysis
https://mothur.org/wiki/silva_reference_files/
https://mothur.org/wiki/rdp_reference_files/
https://mothur.org/wiki/greengenes-formatted_databases/
Hi, I ran
screen.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table, summary=stability.trim.contigs.good.unique.summary, start=1800, end=11990, maxhomop=8)
and after that summary.seqs(fasta=current, count=current)
and my output is:
Using 8 processors.
[ERROR]: /Users/claudiaperdomo/Desktop/cabras/stability.trim.contigs.good.good.count_table is blank. Please correct.
[ERROR]: /Users/claudiaperdomo/Desktop/cabras/stability.trim.contigs.good.good.count_table is blank. Please correct.
[ERROR]: /Users/claudiaperdomo/Desktop/cabras/stability.trim.contigs.good.unique.good.align is blank. Please correct.
Error in reading your fastafile, at position -1. Blank name.
What's wrong? What should I do? Thanks!!