metagenome-atlas / atlas

ATLAS - Three commands to start analyzing your metagenome data
https://metagenome-atlas.github.io/
BSD 3-Clause "New" or "Revised" License

Atlas on test sample fails #210

Closed ahorvath closed 3 years ago

ahorvath commented 5 years ago

Hi,

I ran into the following error when running atlas run genomes (after qc and assembly):

[2019-06-07 18:35 CRITICAL] Command 'snakemake --snakefile /data10/programs/bin/anaconda2/lib/python3.6/site-packages/atlas/Snakefile --directory /data10/working_groups/horvath_group/Metagenome/atlas --printshellcmds --jobs 40 --rerun-incomplete --configfile '/data10/working_groups/horvath_group/Metagenome/atlas/config.yaml' --nolock --use-conda --conda-prefix /data10/working_groups/horvath_group/Metagenome/atlas/databases/conda_envs genomes ' returned non-zero exit status 1.

The full log is attached.

2019-06-07T183547.799298.snakemake.log

Can you help me with that? Thanks in advance. Bests, Attila

SilasK commented 5 years ago

Hey Attila

Thank you for using Atlas. Can you tell me which version of Atlas you are using?

The binning step with metabat2 didn't produce any output.

Can you check the log file of the failed rule? It is something like Mycoplasma/logs/binning/metabat.log
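
For illustration only, here is a minimal Python sketch for printing the end of such a log file. It is not part of Atlas: the helper name `tail_log` is made up, and the path is only the example given above, so the real file name on your system may differ.

```python
from collections import deque
from pathlib import Path


def tail_log(path, n=20):
    """Return the last n lines of a log file, or None if it does not exist."""
    log = Path(path)
    if not log.is_file():
        return None
    with log.open("r", errors="replace") as handle:
        # deque with maxlen keeps only the last n lines while reading
        return [line.rstrip("\n") for line in deque(handle, maxlen=n)]


# Assumed path, taken from the example in this comment.
lines = tail_log("Mycoplasma/logs/binning/metabat.log")
if lines is None:
    print("log file not found")
else:
    print("\n".join(lines))
```

If the file is missing, listing the contents of the sample's logs directory is the next thing to try.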

I'm a bit suspicious because your sample is called Mycoplasma. Is it a single genome or a metagenome? Can you explain to me what you want to do?


ahorvath commented 5 years ago

I'm new to atlas and would like to try out the whole pipeline with the example dataset. I am using version 2.1.1.

I ran these commands:

atlas init --db-dir databases example_data/
atlas run qc
atlas run assembly
atlas run genomes

Did I do anything wrong?

Many thanks for your help.

SilasK commented 5 years ago

No, the commands are totally correct. It is my test data that is not optimal. I tried to use very light simulated reads so that a test run finishes quickly. However, they don't always pass the binning step when I run it. I haven't had the time to find better test data, sorry.

The good news is that you managed to install the pipeline and run it successfully up to the assembly. I think there is no reason why atlas should fail on your real data. And if it does, I'm happy to look into the issues.

What you can do:

  1. If you use a cluster system, try the automatic cluster submission (see docs)
  2. Run atlas on your data (or the CAMI data if you want)
  3. If you want to make sure all the databases (100 GB) are downloaded first, run: atlas download

ahorvath commented 5 years ago

I tried it on real data (M08, https://www.ncbi.nlm.nih.gov/sra/?term=SRP042956; file M08.fastq.gz) and got an error. The log is attached: decontamination.log

The problem might be that these are 454 reads. Do I have to set the read length somewhere?

SilasK commented 5 years ago

Dear Attila,

The sample you are using is a 454 amplicon sequence file (16S). Atlas is a pipeline for (whole genome) metagenomics. I know that in publications amplicon methods are often called "metagenomics" because it sounds fancier.
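
As a rough check that is independent of Atlas, summarizing the read lengths in a FASTQ file can hint at what kind of data it holds: 454 reads usually vary a lot in length, while untrimmed Illumina reads are typically all the same length. The sketch below is only an illustration. The function names are made up, and it assumes plain 4-line FASTQ records (no wrapped sequence lines).

```python
import gzip
from collections import Counter


def read_lengths(fastq_path, max_reads=10000):
    """Count read lengths in the first max_reads records of a FASTQ file.

    Assumes 4-line records; handles plain or gzip-compressed files.
    """
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    lengths = Counter()
    with opener(fastq_path, "rt") as handle:
        for i, line in enumerate(handle):
            if i // 4 >= max_reads:
                break
            if i % 4 == 1:  # second line of each record is the sequence
                lengths[len(line.strip())] += 1
    return lengths


def summarize(lengths):
    """Return read count, min, max and mean length, or None if empty."""
    total = sum(lengths.values())
    if total == 0:
        return None
    mean = sum(length * count for length, count in lengths.items()) / total
    return {"reads": total, "min": min(lengths), "max": max(lengths), "mean": mean}


# Example (file name taken from this thread):
# print(summarize(read_lengths("M08.fastq.gz")))
```

A wide spread between the minimum and maximum length is only a hint, since quality trimming also produces reads of varying length.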

If you want to analyze amplicon sequence data, I recommend dada2 https://benjjneb.github.io/dada2/index.html or qiime2. Both have tutorials. If you want to do metagenomics, you are welcome to try atlas with, e.g., the CAMI 1 dataset:

"1st CAMI Challenge Dataset 1 CAMI_low" on page 2 on https://data.cami-challenge.org/participate