metagenome-atlas / atlas

ATLAS - Three commands to start analyzing your metagenome data
https://metagenome-atlas.github.io/
BSD 3-Clause "New" or "Revised" License

atlas, how to not bin? #549

Closed luozhy88 closed 2 years ago

luozhy88 commented 2 years ago

I need to get a taxonomy and an OTU table with contigs as the input, but atlas creates a taxonomy that contains binning IDs.

The command atlas run genomes --skip-binning --resources mem=600 -j 130 gives an error!

SilasK commented 2 years ago

Hmm, I'm happy to help you, but I don't understand what you need. What microbiome do you have?

Atlas is designed to assemble and bin genomes from your reads and then use them for quantification.

Depending on the microbiome you are studying, you could use reference genomes that are already assembled, in which case you don't need to assemble.

You can also taxonomically annotate your contigs, but you won't get an OTU table.

What exactly do you mean by OTU? The term is usually used for 16S sequencing.

luozhy88 commented 2 years ago

Thanks, I want to annotate my contigs with atlas. How?

atlas run genomes --skip-binning

SilasK commented 2 years ago

Set final_binner to "SemiBin" in the config file, then run atlas:

atlas run binning --omit-from semibin_train

Run this first with the --dryrun option to check what it does.

It should filter out contigs <1500 bp and annotate the contigs of each sample with taxonomy.

luozhy88 commented 2 years ago

Thanks! It works!

But I have a problem with the raw counts. I need a table in which the row names are contigs and the columns are sample IDs. How?

SilasK commented 2 years ago

For each sample, the coverage is in "{sample}/assembly/contig_stats/postfilter_coverage_stats.txt". In the same folder you can also find it per base and per 1 kb block, if this is of interest to you.
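If it helps, the per-sample coverage files can be stacked into one long table with an extra sample column. Below is a minimal Python sketch, assuming each file is tab-separated with a single header row and that all files share the same columns (the file layout is not confirmed in this thread). The helper name collect_contig_coverage and the output file name are made up for illustration and are not part of atlas. Contig IDs remain specific to each sample.

```python
import csv
from pathlib import Path

# Relative path as given in the comment above. The layout of the file
# (tab-separated, one header row) is an assumption.
STATS_FILE = "assembly/contig_stats/postfilter_coverage_stats.txt"


def collect_contig_coverage(workdir, out_path):
    """Stack per-sample coverage tables into one long table.

    Hypothetical helper, not part of atlas. Returns the number of
    contig rows written.
    """
    workdir = Path(workdir)
    header_written = False
    n_rows = 0
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out, delimiter="\t", lineterminator="\n")
        # One sub-folder per sample; the folder name is used as the sample ID.
        for stats in sorted(workdir.glob(f"*/{STATS_FILE}")):
            sample = stats.relative_to(workdir).parts[0]
            with open(stats, newline="") as handle:
                reader = csv.reader(handle, delimiter="\t")
                header = next(reader, None)
                if header is None:
                    continue  # skip empty files
                if not header_written:
                    # Assumes every file has the same columns as the first one.
                    writer.writerow(["sample"] + header)
                    header_written = True
                for row in reader:
                    writer.writerow([sample] + row)
                    n_rows += 1
    return n_rows
```

Called from the atlas working directory, for example as collect_contig_coverage(".", "all_samples_contig_coverage.tsv"), this would write one row per contig and sample.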

luozhy88 commented 2 years ago

If we want to compare samples in a differential analysis of contigs, we need a unified table in which the columns are sample IDs and the row names are contigs. How can I get this table?

When I use "atlas run binning --omit-from semibin_train", the output is a table that contains only one sample.

SilasK commented 2 years ago

Can you explain what you want to do? I do not do a co-assembly. Each sample gets assembled separately. Also, the taxonomic annotation is per sample.

I create a unified gene catalog and a unified set of MAGs for quantification. However, I do not know how to combine all the contigs, as there are many partially similar contigs in different samples.

luozhy88 commented 2 years ago

Thanks. Yes, I want to combine all the contigs from the different samples in one table. Maybe I need to use another method.

luozhy88 commented 1 year ago

I changed the samples.tsv file and saved it as samles_new.tsv. Can I specify its absolute path, like this: atlas run qc --samples /home/zhiyu/atlas/samles_new.tsv ?

SilasK commented 1 year ago

No, you cannot specify the path to the samples.tsv. You should rename the old one to samples_old.tsv and the new one to samples.tsv. By the way, if you have questions about the command line interface, simply add --help.
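If the sample sheet has to be switched often, the renaming can be scripted. Here is a rough Python sketch, assuming atlas reads samples.tsv from the working directory as described above. The helper activate_sample_sheet and its arguments are hypothetical and not part of atlas.

```python
import shutil
from pathlib import Path


def activate_sample_sheet(workdir, new_sheet, backup_name="samples_old.tsv"):
    """Back up the current samples.tsv and copy new_sheet into its place.

    Hypothetical helper, not part of atlas.
    """
    workdir = Path(workdir)
    active = workdir / "samples.tsv"
    backup = workdir / backup_name
    if active.exists():
        if backup.exists():
            # Never silently overwrite an earlier backup.
            raise FileExistsError(f"{backup} already exists; refusing to overwrite it")
        active.rename(backup)
    # Copy rather than move, so the original new sheet stays where it was.
    shutil.copyfile(new_sheet, active)
    return active
```

The sketch refuses to overwrite an existing backup, so an older sample sheet is not lost by accident.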

SilasK commented 1 year ago

> Thanks. Yes, I want to combine all the contigs from the different samples in one table. Maybe I need to use another method.

By the way there is a way to do what you want with atlas:

run:

atlas run None "Cobinning/vamb/coverage.tsv" <other params>

This uses minimap to map all reads to the combined filtered contigs, which you have taxonomically annotated before. However, I'm not sure how this approach handles multi-mapping reads.

In principle it is used for vamb binning, but as I understand it, you are not interested in binning.

luozhy88 commented 1 year ago

https://github.com/metagenome-atlas/atlas/issues/549#issuecomment-1250222896

I often run many batches of data with atlas, so renaming is troublesome. If the sample.tsv file could be specified, it would be convenient for my record keeping.

SilasK commented 1 year ago

I don't understand.

luozhy88 commented 1 year ago

If I run 3 batches in the same directory, I have to rename sample.tsv three times. So I would like to specify the file name. Example: sample_batch1.tsv, sample_batch2.tsv, sample_batch3.tsv.

SilasK commented 1 year ago

Why would you run three batches in the same directory?

Maybe you don't know that atlas already runs everything in parallel and can be efficiently deployed on a cluster. See the docs.

Once the qc step has passed, the sample.tsv is no longer altered and you could run batches, if really necessary.