Closed eparejatobes closed 7 years ago
@marina-manrique results of the latest MG7 run on the Illumina/PacBio mock data are in
s3://resources.ohnosequences.com/ohnosequences/mg7/1.0.0-M5-pr78-158-ge56bab7/test/
I'm releasing MG7 1.0.0-M5 based on the corresponding version and writing missing docs in this issue.
Cool! I'm checking them today!
I'm reviewing this.
There are some serious issues with the output in s3://resources.ohnosequences.com/ohnosequences/mg7/1.0.0-M5-pr78-158-ge56bab7/test/pacbio/:
,Taxa,,,0.0000,NaN
- 0.0000 everywhere
- NaN in some cases; is this expected? If so, why?

@laughedelic please mark 1.0.0-M5 as broken, and open issues for all of the above.
@eparejatobes thanks for the feedback!
Also I see a lot of no-hits. Which params were used here? m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs from staggered, for example, has:
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS000095DC7A,99.59,1474,4,2,1,1495,0.0,2687,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS000002B710,99.59,1474,4,2,1,1492,0.0,2687,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS00008E4995,99.59,1474,4,2,1,1501,0.0,2687,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS00002E8C4B,99.59,1474,4,2,1,1501,0.0,2687,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS00005C9B1E,99.59,1474,4,2,1,1482,0.0,2687,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS000003687D,99.59,1474,4,2,1,1493,0.0,2687,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS0000520518,99.59,1474,4,2,1,1501,0.0,2687,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS0000052F24,99.59,1474,4,2,1,1473,0.0,2687,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS0000586E96,99.59,1474,4,2,1,1501,0.0,2687,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS00006160CB,99.53,1474,5,2,1,1473,0.0,2684,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS000048ACA2,99.53,1474,5,2,1,1473,0.0,2684,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS0000774040,99.53,1474,5,2,1,1501,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS0000695978,99.53,1474,5,2,1,1500,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS0000832597,99.59,1471,4,2,1,1487,0.0,2682,99
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS0000051E9E,99.53,1474,5,2,1,1501,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS00007A220D,99.53,1474,5,2,1,1475,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS00007B0CD4,99.53,1474,5,2,1,1475,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS000025D69C,99.53,1474,5,2,1,1501,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS0000826EAF,99.53,1474,5,2,1,1489,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS00007F2B41,99.53,1474,5,2,1,1501,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS0000468437,99.53,1474,5,2,1,1501,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS00000589C9,99.53,1474,5,2,1,1501,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS0000037144,99.53,1474,5,2,1,1501,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS00007BC3DC,99.53,1474,5,2,1,1493,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS000012200D,99.53,1474,5,2,1,1501,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS00004EF4A8,99.53,1474,5,2,1,1494,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS00001F158F,99.53,1474,5,2,1,1501,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS00000C125A,99.53,1474,5,2,1,1501,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS0000186B7C,99.53,1474,5,2,1,1501,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS00004AAA6D,99.53,1474,5,2,1,1493,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS0000029715,99.53,1474,5,2,1,1486,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS00001BA799,99.53,1474,5,2,1,1490,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS000042A81F,99.53,1474,5,2,1,1493,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS00003CF2A3,99.53,1474,5,2,1,1501,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS00003F90D9,99.53,1474,5,2,1,1477,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS00004F66E9,99.53,1474,5,2,1,1509,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS000006F338,99.53,1474,5,2,1,1473,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS00007DF57E,99.53,1474,5,2,1,1492,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS0000804FB8,99.53,1474,5,2,1,1501,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS000008006E,99.53,1475,4,3,1,1494,0.0,2682,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS00008DD3FF,99.53,1474,4,3,1,1500,0.0,2680,100
m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS000051015B,99.52,1473,5,2,1,1499,0.0,2680,100
...
- this is expected for taxa that have no direct assignments (i.e. no hits, no pidents)
In that case it should be the weighted average of their descendants
> In that case it should be the weighted average of their descendants

I can't find the discussion here, but I remember that we talked about it and, if I remember right, decided not to implement this feature in v1.0. I could have forgotten or mixed it up, of course.
> Also I see a lot of no-hits. Which params were used here?

See the defaults code.
OK fine about average identity. With respect to the no-hits issue, the only reason I can think of is word size. Next time you run this (after fixing those bugs above) use the same word size that is the global default: word_size(46).
> fine about average identity

I opened https://github.com/ohnosequences/mg7/issues/126 so that we don't forget to do it later.
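
For reference, a minimal sketch of what "the weighted average of their descendants" could look like. This is not MG7 code: the `Taxon` class, its field names, and the choice to weight each taxon by its read count are all assumptions made for illustration. A taxon with no direct assignments has NaN as its own average, and its accumulated average is then derived from whatever sits below it in the tree.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Taxon:
    """Hypothetical taxonomy node (not MG7's actual data model)."""
    name: str
    direct_count: int = 0                   # reads assigned directly to this taxon
    direct_avg_identity: float = math.nan   # NaN when there are no direct hits
    children: list = field(default_factory=list)


def accumulated(taxon):
    """Return (read count, average identity) over a taxon and its descendants.

    Assumption: each node contributes its own average identity weighted
    by the number of reads assigned directly to it.
    """
    total = taxon.direct_count
    # Skip the node's own term when it has no reads, so 0 * NaN never leaks in.
    weighted_sum = total * taxon.direct_avg_identity if total else 0.0
    for child in taxon.children:
        count, avg = accumulated(child)
        if count:
            total += count
            weighted_sum += count * avg
    if total == 0:
        return 0, math.nan  # nothing assigned anywhere in this subtree
    return total, weighted_sum / total


# A genus with no direct assignments and two species below it.
genus = Taxon("genus", children=[
    Taxon("species_a", direct_count=3, direct_avg_identity=99.0),
    Taxon("species_b", direct_count=1, direct_avg_identity=97.0),
])
print(accumulated(genus))  # (4, 98.5)
```

With this approach NaN would only remain for subtrees that have no assignments at all.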
> Next time you run this (after fixing those bugs above) use the same word size that is the global default: word_size(46)

OK
oh I almost forgot; what about the coverage filter? is it 100%?
Yes. The default filter for both Illumina and PacBio is qcovs == 100. Do you want to change it?
For pacbio 99 and 98.5 identity
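
To make the filter discussion concrete, here is a rough sketch of such a filter applied to rows like the ones pasted above. It is not the MG7 implementation. It assumes the third CSV field is the percent identity and the last field is qcovs, which matches the values shown but is not stated in this thread. The function name `filter_hits` and its parameters are made up for this example, and reading "99 and 98.5 identity" as a minimum coverage of 99 plus a minimum identity of 98.5 is only one possible interpretation.

```python
import csv

# Three rows copied from the output above, plus the "..." elision line.
SAMPLE = [
    "m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS000095DC7A,99.59,1474,4,2,1,1495,0.0,2687,100",
    "m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS0000832597,99.59,1471,4,2,1,1487,0.0,2682,99",
    "m150115_081355_sherri_c100725952230000001823153204301532_s1_p0/30523/ccs,gnl|ohnosequences.db.rna16s|URS000051015B,99.52,1473,5,2,1,1499,0.0,2680,100",
    "...",
]


def filter_hits(lines, min_qcovs=100.0, min_pident=0.0):
    """Yield the CSV rows that pass the coverage and identity thresholds.

    Assumed column layout: row[2] is percent identity, row[-1] is qcovs.
    qcovs cannot exceed 100, so the default min_qcovs=100 behaves like
    the qcovs == 100 filter mentioned above.
    """
    for row in csv.reader(lines):
        if len(row) < 11:
            continue  # skip blank or elided lines such as "..."
        pident = float(row[2])
        qcovs = float(row[-1])
        if qcovs >= min_qcovs and pident >= min_pident:
            yield row


strict = list(filter_hits(SAMPLE))
relaxed = list(filter_hits(SAMPLE, min_qcovs=99, min_pident=98.5))
print(len(strict), len(relaxed))  # 2 3
```

In this small sample the relaxed thresholds let through the one hit with qcovs 99 that the default filter drops.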
@eparejatobes Done. Review the results please:
s3://resources.ohnosequences.com/ohnosequences/mg7/1.0.0-M5-15-gfb8a06a/test/
I'm working on this
@eparejatobes told me that it's fine. Merging this.
This is a portmanteau issue covering