Closed lauramason326 closed 2 years ago
Hi Laura,
I take it you mean singlem condense, not coverm.
- What does coverage mean here? Is it useful for understanding the number of OTUs with this taxonomy in the sample?
No, not really. Coverage is an estimate of what the genome's coverage would be if the whole genome were available and reads were mapped to it. It is based on the number of reads that have that taxonomy and the length of those reads (like k-mer coverage).
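To make the k-mer coverage analogy concrete, here is a minimal sketch of the standard conversion from "number of reads spanning a fixed window" to per-base coverage. It is not taken from the SingleM source and may differ from what SingleM actually computes; the function name `estimated_coverage` and the default 60 bp window are assumptions made for illustration.

```python
def estimated_coverage(num_reads: int, read_length: int, window_length: int = 60) -> float:
    """Estimate per-base genome coverage from reads that span a marker window.

    Assumption (not from the thread): a read of length L fully spans a window
    of length k only if it starts at one of L - k + 1 positions, so n spanning
    reads correspond to a coverage of n * L / (L - k + 1). This is the usual
    k-mer coverage to base coverage relation.
    """
    if read_length < window_length:
        raise ValueError("read_length must be at least window_length")
    usable_starts = read_length - window_length + 1
    return num_reads * read_length / usable_starts


if __name__ == "__main__":
    # 10 reads of 150 bp spanning a 60 bp window
    print(estimated_coverage(10, 150))
```

The takeaway is that coverage scales with the read count for a taxon but is not itself a count of OTUs.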
How is the taxonomy assigned? What happens if different markers align with different portions of the same gene and the different portions correspond to different taxa?
Currently at least, only one portion of each gene is in the packages, so I'm not sure how different portions would correspond to different taxa.
I am trying to import the data into lefse to obtain differentially enriched OTUs for my treatments. Is the output from condense what I want for this?
You could get differentially abundant taxa, not enriched OTUs.
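As a rough sketch of how a per-taxon table could be reshaped for a tool like LEfSe, the snippet below pivots a long table into a taxon-by-sample matrix. It assumes, without confirmation from this thread, that the condensed table is tab-separated with `sample`, `coverage` and `taxonomy` columns and that taxonomy levels are separated by `;`. The helper name `pivot_condensed` and the example data are hypothetical.

```python
import csv
import io
from collections import defaultdict


def pivot_condensed(rows):
    """Pivot long-format rows into (samples, {taxon: [coverage per sample]}).

    Assumed input: dicts with 'sample', 'coverage' and 'taxonomy' keys.
    Taxonomy levels are re-joined with '|', the separator LEfSe uses for
    hierarchical feature names. Missing taxon/sample pairs become 0.0.
    """
    samples = []
    table = defaultdict(dict)
    for row in rows:
        sample = row["sample"]
        if sample not in samples:
            samples.append(sample)
        levels = [part.strip() for part in row["taxonomy"].split(";") if part.strip()]
        taxon = "|".join(levels)
        table[taxon][sample] = table[taxon].get(sample, 0.0) + float(row["coverage"])
    matrix = {taxon: [cov.get(s, 0.0) for s in samples] for taxon, cov in table.items()}
    return samples, matrix


# Hypothetical example data, not real output from this analysis.
EXAMPLE_TSV = (
    "sample\tcoverage\ttaxonomy\n"
    "S1\t2.5\tRoot; d__Bacteria\n"
    "S1\t1.0\tRoot; d__Archaea\n"
    "S2\t4.0\tRoot; d__Bacteria\n"
)

samples, matrix = pivot_condensed(csv.DictReader(io.StringIO(EXAMPLE_TSV), delimiter="\t"))
```

LEfSe additionally needs one or more class-label rows (for example, treatment) above the feature rows, which this sketch does not add.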
Hi Ben, thanks for the info. I think your response may have been cut off though: "No, it isn't helpful except i" is where the post stops, at least on my end. Can you elaborate please? Thank you! Laura
Sorry, that last bit was left over from an earlier draft. I forgot to delete it and have edited the post now.
Is some part still unclear?
Yes, thank you. Laura
Hi @wwood, this is Gin. I am working with 2 non-EMERGE scientists (PhD student Laura Mason @lauramason326, with whom you corresponded above, thank you so much; Laura works with co-supervisor Prof. Richard Dick on ag micro at OSU; and now-postdoc Dawson Fairbanks, who is co-supervised by Prof. Rachel Gallery at the University of Arizona), and I had a couple of Qs as we work through the SingleM analyses of their metaGs, which I can't find answered in the readme (apologies if I'm missing them!):
Thanks and so sorry for the slew of Qs!! g
Hi @Virginia-Rich yes the documentation could be better. This is a consequence of it being under active development.
Sorry for the instability here, but hope that helps.
Also, to your point about some genes having more taxonomic assignment potential than others, I definitely agree. I'm trying to devise a condense implementation that handles that more gracefully; there are some scratchings on the whiteboard.
I also have an r207 metapackage, if that would help?
Hi @wwood, re. issue 6, here is the code I used to run condense with the updated package:
singlem pipe --forward ${SAMPLE}_R1.fastq --reverse ${SAMPLE}_R2.fastq --otu-table $outdir/${SAMPLE}.otu_table.csv --threads 20 --singlem-metapackage /fs/project/PAS1212/bioinformatic_tools/singlem_db/S3.0.1.metapackage_20211101.smpkg
singlem summarise --input_otu_tables *.csv --output_otu_table combined.otu_table.csv
singlem summarise --input_otu_tables combined.otu_table.csv --cluster --cluster-id 0.95 --clustered_output_otu_table newcsvfile.csv
singlem condense --input_otu_tables newcsvfile.csv --output_otu_table condense.csv --singlem-metapackage /fs/project/PAS1212/bioinformatic_tools/singlem_db/S3.0.1.metapackage_20211101.smpkg
Thanks Dawson, would you mind emailing me the combined otu table please?
Hi Ben, per our discussion on this post, I ran coverM in condense mode. However, I have a few clarifying questions:
Thanks in advance, Laura