metagenome-atlas / atlas

ATLAS - Three commands to start analyzing your metagenome data
https://metagenome-atlas.github.io/
BSD 3-Clause "New" or "Revised" License

Atlas for metatranscriptomes #143

Closed SilasK closed 3 years ago

SilasK commented 5 years ago

I did also have one unrelated question that I should probably post in another thread, but... I know ATLAS is currently geared towards bacteria, but are there any recommendations for modifying the standard pipeline when dealing with metatranscriptomes that have high eukaryote abundance?

@mykophile I haven't used Atlas for metatranscriptomes. Do you have DNA and RNA or only RNA?

You should use rna-spades by changing the following parameters in the config file:

assembler: spades
spades_preset: rna
spades_extra: "" # extra command line options

I added the option in the latest commits to the master branch, so you would need to update.

I think prodigal can predict genes in assembled RNA, so you can make a gene catalog using atlas assemble gene_catalog

and atlas assemble to get all the other reports and so on.

mykophile commented 5 years ago

I've got RNA only, about 50% Euk and 50% Prok. I think the main problem is that Prokka and Prodigal are trained for prokaryotic genomes. I wonder if it would be reasonable to skip gene prediction and just use Diamond on the assembled transcripts, since it searches all possible reading frames.
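To illustrate what "searching all possible reading frames" means, here is a minimal Python sketch of a six-frame translation. It is only a conceptual illustration, not DIAMOND's actual implementation; the function names are made up for this example, and the standard genetic code is assumed (which may not hold for every eukaryote or organelle in a mixed sample).

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumption:
# every organism in the sample uses this table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Map frame labels (+1..+3, -1..-3) to the translated protein strings."""
    seq = seq.upper().replace("U", "T")
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translations("ATGGCCTAA").items():
        print(frame, protein)
```

A translated search aligns each of these six protein strings against the reference proteins, so no prior gene call is needed, at the cost of more search work per transcript.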

SilasK commented 5 years ago

Should this become part of Atlas? https://www.biorxiv.org/content/early/2018/08/07/386110.full.pdf+html

SilasK commented 5 years ago

@mykophile Can I just take the RNA contigs as genes and cluster them with cd-hit? Do you want to use diamond to annotate the transcripts to eggNOG and the taxonomy?

mykophile commented 5 years ago

PLASS looks quite promising and would be great to see as part of ATLAS. I was thinking of using diamond for the annotation / taxonomy steps, not clustering.

SilasK commented 5 years ago

@MCamp91 You're welcome to continue the discussion here. I have no metatranscriptome data available, so I could not test it.

I'm not sure what the contigs of rnaSpades look like, and whether prodigal can predict genes on them or whether they already are the genes.

Could you try running rnaSpades and then send me one of the output files?

e.g. sample/sample_contigs.fasta

MCamp91 commented 5 years ago

Matthew Campbell has shared a OneDrive for Business file with you. To view it, click the link below.

MM1-SB_contigs(1).fasta: https://studentcurtinedu-my.sharepoint.com/personal/15164810_student_curtin_edu_au/Documents/Attachments/MM1-SB_contigs(1).fasta

Hello,

I've attached an output file from the run I did with megahit. The pipeline was unable to assemble contigs with spades and the suggested config settings.

However, the spades run did produce these outputs in the assembly folder: hard_filtered_transcripts.fasta, transcripts.fasta, before_rr.fasta, dataset.info and soft_filtered_transcripts.fasta. Would you be interested in seeing any of these?

Cheers,

Matt


MCamp91 commented 5 years ago

Looks like the attachment didn't work.

https://www.dropbox.com/s/juc16etah9tev50/MM1-SB_contigs.fasta?dl=0

Let me know if you can use this link.

Matt


SilasK commented 5 years ago

OK, I see. So with rna-spades I should use transcripts.fasta, but it did work with megahit.

Do you know if prodigal can predict genes from transcripts?

Can you try to run atlas run genecatalog

with

assembler: megahit

genecatalog:
  source: contigs

MCamp91 commented 5 years ago

Hello,

I did the run genecatalog but ran into this issue (see below).

Looks like it's still trying to bin the contigs...

These were the inputs I used.

Run with config parameters set to:

assembler: megahit

genecatalog:
  source: contigs

Running script: nohup /data/work/atlas_scripts/run.sh genecatalog &

Cheers,

Matt

40 of 83 steps (48%) done

[Thu Feb 14 04:09:40 2019]
rule maxbin:
    input: MM2-SB/MM2-SB_contigs.fasta, MM2-SB/binning/coverage/MM2-SB_coverage.txt
    output: MM2-SB/binning/maxbin/intermediate_files
    log: MM2-SB/logs/binning/maxbin.log
    jobid: 60
    wildcards: sample=MM2-SB
    threads: 16

    mkdir MM2-SB/binning/maxbin/intermediate_files 2> MM2-SB/logs/binning/maxbin.log
    run_MaxBin.pl -contig MM2-SB/MM2-SB_contigs.fasta             -abund MM2-SB/binning/coverage/MM2-SB_coverage.txt             -out MM2-SB/binning/maxbin/intermediate_files/MM2-SB             -min_contig_length 1000             -thread 16             -prob_threshold 0.9             -max_iteration 50 >> MM2-SB/logs/binning/maxbin.log

    mv MM2-SB/binning/maxbin/intermediate_files/MM2-SB.summary MM2-SB/binning/maxbin/intermediate_files/.. 2>> MM2-SB/logs/binning/maxbin.log
    mv MM2-SB/binning/maxbin/intermediate_files/MM2-SB.marker MM2-SB/binning/maxbin/intermediate_files/..  2>> MM2-SB/logs/binning/maxbin.log
    mv MM2-SB/binning/maxbin/intermediate_files/MM2-SB.marker_of_each_bin.tar.gz MM2-SB/binning/maxbin/intermediate_files/..  2>> MM2-SB/logs/binning/maxbin.log
    mv MM2-SB/binning/maxbin/intermediate_files/MM2-SB.log MM2-SB/binning/maxbin/intermediate_files/..  2>> MM2-SB/logs/binning/maxbin.log

Activating conda environment: /databases/conda_envs/189bcd05
[Thu Feb 14 04:09:46 2019]
Error in rule maxbin:
    jobid: 60
    output: MM2-SB/binning/maxbin/intermediate_files
    log: MM2-SB/logs/binning/maxbin.log
    conda-env: /databases/conda_envs/189bcd05

RuleException:
CalledProcessError in line 212 of /opt/conda/lib/python3.6/site-packages/atlas/rules/binning.snakefile:
Command 'source activate /databases/conda_envs/189bcd05; set -euo pipefail;
    mkdir MM2-SB/binning/maxbin/intermediate_files 2> MM2-SB/logs/binning/maxbin.log
    run_MaxBin.pl -contig MM2-SB/MM2-SB_contigs.fasta -abund MM2-SB/binning/coverage/MM2-SB_coverage.txt -out MM2-SB/binning/maxbin/intermediate_files/MM2-SB -min_contig_length 1000 -thread 16 -prob_threshold 0.9 -max_iteration 50 >> MM2-SB/logs/binning/maxbin.log

    mv MM2-SB/binning/maxbin/intermediate_files/MM2-SB.summary MM2-SB/binning/maxbin/intermediate_files/.. 2>> MM2-SB/logs/binning/maxbin.log
    mv MM2-SB/binning/maxbin/intermediate_files/MM2-SB.marker MM2-SB/binning/maxbin/intermediate_files/..  2>> MM2-SB/logs/binning/maxbin.log
    mv MM2-SB/binning/maxbin/intermediate_files/MM2-SB.marker_of_each_bin.tar.gz MM2-SB/binning/maxbin/intermediate_files/..  2>> MM2-SB/logs/binning/maxbin.log
    mv MM2-SB/binning/maxbin/intermediate_files/MM2-SB.log MM2-SB/binning/maxbin/intermediate_files/..  2>> MM2-SB/logs/binning/maxbin.log ' returned non-zero exit status 255.

File "/opt/conda/lib/python3.6/site-packages/atlas/rules/binning.snakefile", line 212, in __rule_maxbin
File "/opt/conda/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Removing output files of failed job maxbin since they might be corrupted:
MM2-SB/binning/maxbin/intermediate_files
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /data/work/24samples/atlas_201_genecat_test/.snakemake/log/2019-02-14T024844.817097.snakemake.log
Note the path to the log file for debugging.
Documentation is available at: https://metagenome-atlas.readthedocs.io
Issues can be raised at: https://github.com/metagenome-atlas/atlas/issues
[2019-02-14 04:09 CRITICAL] Command 'snakemake --snakefile /opt/conda/lib/python3.6/site-packages/atlas/Snakefile --directory /data/work/24samples/atlas_201_genecat_test --printshellcmds --jobs 16 --rerun-incomplete --configfile '/data/work/24samples/atlas_201_genecat_test/config.yaml' --nolock --use-conda --conda-prefix /databases/conda_envs genecatalog ' returned non-zero exit status 1
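For long outputs like the one above, a small script can pull out which rule failed and which per-rule log to open. This is only a sketch: the function name is made up, and the regular expression assumes the "Error in rule ...: ... log: ..." wording shown in this output, which may differ between Snakemake versions.

```python
import re

# Matches an "Error in rule <name>:" entry and the first "log:" field after it.
# DOTALL lets the match span line breaks, so it works on wrapped or flat logs.
ERROR_RE = re.compile(
    r"Error in rule (?P<rule>\w+):.*?log:\s*(?P<log>\S+)", re.DOTALL
)


def failed_rules(log_text):
    """Return a list of (rule_name, rule_log_path) for every reported failure."""
    return [(m.group("rule"), m.group("log")) for m in ERROR_RE.finditer(log_text)]


if __name__ == "__main__":
    example = (
        "Error in rule maxbin: jobid: 60 "
        "output: MM2-SB/binning/maxbin/intermediate_files "
        "log: MM2-SB/logs/binning/maxbin.log "
        "conda-env: /databases/conda_envs/189bcd05"
    )
    for rule, log_path in failed_rules(example):
        print(f"rule {rule} failed, see {log_path}")
```

The per-rule log (here MM2-SB/logs/binning/maxbin.log) usually contains the tool's own error message, which the Snakemake summary does not show.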


MCamp91 commented 5 years ago

Hello,

Just to simplify my previous email.

In the output file /data/work/24samples/atlas_201_genecat_test/out_run_1, this is the task that fails:

rule maxbin:
    input: MM2-SB/MM2-SB_contigs.fasta, MM2-SB/binning/coverage/MM2-SB_coverage.txt
    output: MM2-SB/binning/maxbin/intermediate_files
    log: MM2-SB/logs/binning/maxbin.log
    jobid: 60
    wildcards: sample=MM2-SB
    threads: 16

So it looks like the input files are indeed the assembled contigs.

Can I restart the run just by re-executing any atlas command in the same folder? I’m not sure how I can do this going off the current documentation for atlas.

Another question: I’ve tried running my RNAseq data as metagenomes but had issues with the binning. Is there anything I could change in the config file to increase the chances of binning?

Cheers,

Matt


SilasK commented 5 years ago

The genecatalog and binning workflows are linked, which for metagenomic samples ensures that you don't continue having forgotten something.

In your case try:

atlas run None "Genecatalog/all_genes_unfiltered.faa"

to predict the genes and concatenate them. atlas run None runs atlas with no specific workflow.

If this works you can run:

atlas run None "Genecatalog/gene_catalog.faa"

to get the clustered gene catalog. Use the target

"Genecatalog/counts/median_coverage.tsv.gz"

to get coverages

and/or

atlas run None "Genecatalog/annotations/eggNog.tsv"

for annotation.
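The targets above can also be requested one after another from a small wrapper, stopping at the first failure. This is a sketch under assumptions: it assumes atlas is on the PATH, that it is started from the atlas working directory, and that the coverage target is requested with atlas run None in the same way as the others. The function names are made up for illustration.

```python
import shlex
import subprocess

# Targets named in this thread, in the order they were suggested.
TARGETS = [
    "Genecatalog/all_genes_unfiltered.faa",
    "Genecatalog/gene_catalog.faa",
    "Genecatalog/counts/median_coverage.tsv.gz",  # assumed to use the same call
    "Genecatalog/annotations/eggNog.tsv",
]


def build_commands(targets=TARGETS):
    """Build one 'atlas run None <target>' command per target."""
    return [["atlas", "run", "None", target] for target in targets]


def run_in_order(commands, dry_run=True):
    """Print each command; execute it only if dry_run is False.

    check=True makes subprocess raise on the first failing command, so later
    targets are not attempted after an error.
    """
    for command in commands:
        print(shlex.join(command))
        if not dry_run:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run_in_order(build_commands())  # dry run: only prints the commands
```

Calling run_in_order(build_commands(), dry_run=False) would actually start the runs; since each step depends on the previous one, running them in this order keeps a failure easy to attribute.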

SilasK commented 5 years ago

For your other questions:

Can I restart the run just by re-executing any atlas command in the same folder?

Yes, you can and should run atlas in the same folder as long as you use the same assembler.

The differences between metatranscriptomes and metagenomes are mainly at the level of contaminant removal. rRNA is not considered a contaminant for metagenomes, so I'm not sure whether you have added rRNA sequences to be removed.

How to cluster metatranscriptomes?

Can you tell me what data you have and what you want to achieve? Why not on Slack? Maybe final_binner: concoct works better for that.

MCamp91 commented 5 years ago

Hello,

I've been trying to run atlas in a few different configurations with regard to the QC and assemblers, but I'm running into these issues (see attached).

I'm not sure if it's an issue with the pipeline or with the quality of my samples. Is it possible to change some of the parameters of rnaSpades, e.g. contig length?

Cheers


Dynamic output is deprecated in favor of checkpoints (see docs). It will be removed in Snakemake 6.0.
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 16
Rules claiming more threads will be scaled down.
Unlimited resources: mem, java_mem
Job counts:
    count   jobs
    7       align_reads_to_MAGs
    7       align_reads_to_final_contigs
    7       align_reads_to_prefilter_contigs
    1       all
    7       apply_quality_filter
    1       assembly
    7       assembly_one_sample
    7       bam_2_sam_MAGs
    7       bam_2_sam_binning
    7       bam_2_sam_contigs
    1       binning
    1       build_assembly_report
    1       build_bin_report
    1       build_db_genomes
    1       build_decontamination_db
    1       build_qc_report
    14      calculate_contigs_stats
    7       calculate_insert_size
    1       cat_get_name
    1       cat_on_bin
    1       cluster_genes
    1       combine_annotations
    1       combine_bined_coverages_MAGs
    7       combine_coverages
    1       combine_coverages_MAGs
    1       combine_insert_stats
    1       combine_read_counts
    1       combine_read_length_stats
    1       concat_genes
    7       convert_concoct_csv_to_tsv
    14      convert_sam_to_bam
    7       deduplicate_reads
    1       download_cat_db
    1       download_eggNOG_fastas
    3       download_eggNOG_files
    1       eggNOG_annotation
    1       eggNOG_homology_search
    7       error_correction
    7       filter_by_coverage
    1       filter_genes
    7       finalize_contigs
    7       finalize_sample_qc
    7       find_16S
    1       first_dereplication
    1       gene_subsets
    1       genecatalog
    1       genomes
    1       get_all_16S
    1       get_all_bins
    7       get_bins
    7       get_contig_coverage_from_bb
    7       get_contigs_from_gene_names
    1       get_genome_for_cat
    1       get_genomes2cluster
    7       get_maxbin_cluster_attribution
    7       get_metabat_depth_file
    1       get_quality_for_dRep_from_checkm
    35      get_read_stats
    1       get_rep_proteins
    21      get_unique_bin_ids
    21      get_unique_cluster_attribution
    21      init_pre_assembly_processing
    1       initialize_checkm
    7       initialize_qc
    7       maxbin
    7       merge_pairs
    1       merge_taxonomy
    7       metabat
    7       pileup
    7       pileup_MAGs
    7       pileup_for_binning
    7       pileup_prefilter
    7       predict_genes
    1       predict_genes_genomes
    1       qc
    7       qcreads
    7       rename_contigs
    1       rename_gene_catalog
    1       rename_genomes
    7       rename_megahit_output
    1       rename_protein_catalog
    1       run_all_checkm_lineage_wf
    7       run_checkm_lineage_wf
    8       run_checkm_tree_qa
    7       run_concoct
    7       run_das_tool
    7       run_decontamination
    7       run_megahit
    1       second_dereplication
    7       write_read_counts
    459
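The job counts above show what Snakemake plans to run, which is a quick way to check whether binning is scheduled before waiting for a failure. Below is a hedged Python sketch that parses such a listing from pasted output. The function name and the set of binning rule names are picked from this listing for illustration only, and the "Job counts: count jobs" header wording is assumed to match the Snakemake version used here.

```python
import re

HEADER = re.compile(r"Job counts:\s*count\s+jobs")
PAIR = re.compile(r"\s*(\d+)\s+([A-Za-z_]\w*)")  # "<count> <rule name>"
TOTAL = re.compile(r"\s*(\d+)")                   # trailing total without a name

# Rule names taken from the listing in this thread (illustrative, not complete).
BINNING_RULES = {"maxbin", "metabat", "run_concoct", "run_das_tool"}


def parse_job_counts(log_text):
    """Return ({rule: count}, total) from a Snakemake 'Job counts' section."""
    header = HEADER.search(log_text)
    if header is None:
        raise ValueError("no 'Job counts' section found")
    counts = {}
    position = header.end()
    # Consume consecutive "<count> <rule>" pairs until the pattern stops matching.
    while (match := PAIR.match(log_text, position)):
        counts[match.group(2)] = int(match.group(1))
        position = match.end()
    total = TOTAL.match(log_text, position)
    return counts, (int(total.group(1)) if total else None)


if __name__ == "__main__":
    text = "Job counts: count jobs 7 maxbin 1 genecatalog 7 predict_genes 15"
    counts, total = parse_job_counts(text)
    print("binning rules scheduled:", sorted(BINNING_RULES & counts.keys()))
    print("counts add up to total:", sum(counts.values()) == total)
```

If any of the binning rules appear in the parsed counts, the requested target still depends on binning, which matches what was seen with the genecatalog workflow earlier in this thread.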

[Tue Mar 5 05:34:36 2019] rule initialize_qc: input: /data/work/24samples/RRBS_Metagenomes/MM14_SB_R1.fastq.gz, /data/work/24samples/RRBS_Metagenomes/MM14_SB_R2.fastq.gz output: MM14-SB/sequence_quality_control/MM14-SB_raw_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_raw_R2.fastq.gz log: MM14-SB/logs/QC/init.log jobid: 222 wildcards: sample=MM14-SB priority: 80 threads: 16 resources: mem=32, java_mem=27

    reformat.sh in=/data/work/24samples/RRBS_Metagenomes/MM14_SB_R1.fastq.gz in2=/data/work/24samples/RRBS_Metagenomes/MM14_SB_R2.fastq.gz             interleaved=f             out1=MM14-SB/sequence_quality_control/MM14-SB_raw_R1.fastq.gz out2=MM14-SB/sequence_quality_control/MM14-SB_raw_R2.fastq.gz             iupacToN=t             touppercase=t             qout=33             overwrite=true             verifypaired=t             addslash=t             trimreaddescription=t             threads=16             -Xmx27G 2> MM14-SB/logs/QC/init.log

Activating conda environment: /databases/conda_envs/81c7173e [Tue Mar 5 05:36:31 2019] Finished job 222. 1 of 459 steps (0.22%) done

[Tue Mar 5 05:36:31 2019] rule initialize_qc: input: /data/work/24samples/RRBS_Metagenomes/MM13_SB_R1.fastq.gz, /data/work/24samples/RRBS_Metagenomes/MM13_SB_R2.fastq.gz output: MM13-SB/sequence_quality_control/MM13-SB_raw_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_raw_R2.fastq.gz log: MM13-SB/logs/QC/init.log jobid: 216 wildcards: sample=MM13-SB priority: 80 threads: 16 resources: mem=32, java_mem=27

    reformat.sh in=/data/work/24samples/RRBS_Metagenomes/MM13_SB_R1.fastq.gz in2=/data/work/24samples/RRBS_Metagenomes/MM13_SB_R2.fastq.gz             interleaved=f             out1=MM13-SB/sequence_quality_control/MM13-SB_raw_R1.fastq.gz out2=MM13-SB/sequence_quality_control/MM13-SB_raw_R2.fastq.gz             iupacToN=t             touppercase=t             qout=33             overwrite=true             verifypaired=t             addslash=t             trimreaddescription=t             threads=16             -Xmx27G 2> MM13-SB/logs/QC/init.log

Activating conda environment: /databases/conda_envs/81c7173e [Tue Mar 5 05:38:26 2019] Finished job 216. 2 of 459 steps (0.44%) done

[Tue Mar 5 05:38:26 2019] rule initialize_qc: input: /data/work/24samples/RRBS_Metagenomes/MM15_SB_R1.fastq.gz, /data/work/24samples/RRBS_Metagenomes/MM15_SB_R2.fastq.gz output: MM15-SB/sequence_quality_control/MM15-SB_raw_R1.fastq.gz, MM15-SB/sequence_quality_control/MM15-SB_raw_R2.fastq.gz log: MM15-SB/logs/QC/init.log jobid: 231 wildcards: sample=MM15-SB priority: 80 threads: 16 resources: mem=32, java_mem=27

    reformat.sh in=/data/work/24samples/RRBS_Metagenomes/MM15_SB_R1.fastq.gz in2=/data/work/24samples/RRBS_Metagenomes/MM15_SB_R2.fastq.gz             interleaved=f             out1=MM15-SB/sequence_quality_control/MM15-SB_raw_R1.fastq.gz out2=MM15-SB/sequence_quality_control/MM15-SB_raw_R2.fastq.gz             iupacToN=t             touppercase=t             qout=33             overwrite=true             verifypaired=t             addslash=t             trimreaddescription=t             threads=16             -Xmx27G 2> MM15-SB/logs/QC/init.log

Activating conda environment: /databases/conda_envs/81c7173e [Tue Mar 5 05:40:17 2019] Finished job 231. 3 of 459 steps (0.65%) done

[Tue Mar 5 05:40:17 2019] rule initialize_qc: input: /data/work/24samples/RRBS_Metagenomes/MM24_B_R1.fastq.gz, /data/work/24samples/RRBS_Metagenomes/MM24_B_R2.fastq.gz output: MM24-B/sequence_quality_control/MM24-B_raw_R1.fastq.gz, MM24-B/sequence_quality_control/MM24-B_raw_R2.fastq.gz log: MM24-B/logs/QC/init.log jobid: 234 wildcards: sample=MM24-B priority: 80 threads: 16 resources: mem=32, java_mem=27

    reformat.sh in=/data/work/24samples/RRBS_Metagenomes/MM24_B_R1.fastq.gz in2=/data/work/24samples/RRBS_Metagenomes/MM24_B_R2.fastq.gz             interleaved=f             out1=MM24-B/sequence_quality_control/MM24-B_raw_R1.fastq.gz out2=MM24-B/sequence_quality_control/MM24-B_raw_R2.fastq.gz             iupacToN=t             touppercase=t             qout=33             overwrite=true             verifypaired=t             addslash=t             trimreaddescription=t             threads=16             -Xmx27G 2> MM24-B/logs/QC/init.log

Activating conda environment: /databases/conda_envs/81c7173e [Tue Mar 5 05:42:06 2019] Finished job 234. 4 of 459 steps (0.87%) done

[Tue Mar 5 05:42:06 2019] rule initialize_qc: input: /data/work/24samples/RRBS_Metagenomes/MM11_SB_R1.fastq.gz, /data/work/24samples/RRBS_Metagenomes/MM11_SB_R2.fastq.gz output: MM11-SB/sequence_quality_control/MM11-SB_raw_R1.fastq.gz, MM11-SB/sequence_quality_control/MM11-SB_raw_R2.fastq.gz log: MM11-SB/logs/QC/init.log jobid: 225 wildcards: sample=MM11-SB priority: 80 threads: 16 resources: mem=32, java_mem=27

    reformat.sh in=/data/work/24samples/RRBS_Metagenomes/MM11_SB_R1.fastq.gz in2=/data/work/24samples/RRBS_Metagenomes/MM11_SB_R2.fastq.gz             interleaved=f             out1=MM11-SB/sequence_quality_control/MM11-SB_raw_R1.fastq.gz out2=MM11-SB/sequence_quality_control/MM11-SB_raw_R2.fastq.gz             iupacToN=t             touppercase=t             qout=33             overwrite=true             verifypaired=t             addslash=t             trimreaddescription=t             threads=16             -Xmx27G 2> MM11-SB/logs/QC/init.log

Activating conda environment: /databases/conda_envs/81c7173e [Tue Mar 5 05:43:47 2019] Finished job 225. 5 of 459 steps (1%) done

[Tue Mar 5 05:43:47 2019] rule initialize_qc: input: /data/work/24samples/RRBS_Metagenomes/MM10_SB_R1.fastq.gz, /data/work/24samples/RRBS_Metagenomes/MM10_SB_R2.fastq.gz output: MM10-SB/sequence_quality_control/MM10-SB_raw_R1.fastq.gz, MM10-SB/sequence_quality_control/MM10-SB_raw_R2.fastq.gz log: MM10-SB/logs/QC/init.log jobid: 228 wildcards: sample=MM10-SB priority: 80 threads: 16 resources: mem=32, java_mem=27

    reformat.sh in=/data/work/24samples/RRBS_Metagenomes/MM10_SB_R1.fastq.gz in2=/data/work/24samples/RRBS_Metagenomes/MM10_SB_R2.fastq.gz             interleaved=f             out1=MM10-SB/sequence_quality_control/MM10-SB_raw_R1.fastq.gz out2=MM10-SB/sequence_quality_control/MM10-SB_raw_R2.fastq.gz             iupacToN=t             touppercase=t             qout=33             overwrite=true             verifypaired=t             addslash=t             trimreaddescription=t             threads=16             -Xmx27G 2> MM10-SB/logs/QC/init.log

Activating conda environment: /databases/conda_envs/81c7173e [Tue Mar 5 05:45:31 2019] Finished job 228. 6 of 459 steps (1%) done

[Tue Mar 5 05:45:31 2019] rule initialize_qc: input: /data/work/24samples/RRBS_Metagenomes/MM12_SB_R1.fastq.gz, /data/work/24samples/RRBS_Metagenomes/MM12_SB_R2.fastq.gz output: MM12-SB/sequence_quality_control/MM12-SB_raw_R1.fastq.gz, MM12-SB/sequence_quality_control/MM12-SB_raw_R2.fastq.gz log: MM12-SB/logs/QC/init.log jobid: 219 wildcards: sample=MM12-SB priority: 80 threads: 16 resources: mem=32, java_mem=27

    reformat.sh in=/data/work/24samples/RRBS_Metagenomes/MM12_SB_R1.fastq.gz in2=/data/work/24samples/RRBS_Metagenomes/MM12_SB_R2.fastq.gz             interleaved=f             out1=MM12-SB/sequence_quality_control/MM12-SB_raw_R1.fastq.gz out2=MM12-SB/sequence_quality_control/MM12-SB_raw_R2.fastq.gz             iupacToN=t             touppercase=t             qout=33             overwrite=true             verifypaired=t             addslash=t             trimreaddescription=t             threads=16             -Xmx27G 2> MM12-SB/logs/QC/init.log

Activating conda environment: /databases/conda_envs/81c7173e [Tue Mar 5 05:47:04 2019] Finished job 219. 7 of 459 steps (2%) done

[Tue Mar 5 05:47:04 2019] rule get_read_stats: input: MM14-SB/sequence_quality_control/MM14-SB_raw_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_raw_R2.fastq.gz output: MM14-SB/sequence_quality_control/read_stats/raw.zip, MM14-SB/sequence_quality_control/read_stats/raw_read_counts.tsv log: MM14-SB/logs/QC/read_stats/raw.log jobid: 60 wildcards: sample=MM14-SB, step=raw priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 05:49:20 2019] Finished job 60. 8 of 459 steps (2%) done

[Tue Mar 5 05:49:20 2019] rule get_read_stats: input: MM13-SB/sequence_quality_control/MM13-SB_raw_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_raw_R2.fastq.gz output: MM13-SB/sequence_quality_control/read_stats/raw.zip, MM13-SB/sequence_quality_control/read_stats/raw_read_counts.tsv log: MM13-SB/logs/QC/read_stats/raw.log jobid: 44 wildcards: sample=MM13-SB, step=raw priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 05:51:27 2019] Finished job 44. 9 of 459 steps (2%) done

[Tue Mar 5 05:51:27 2019] rule get_read_stats: input: MM15-SB/sequence_quality_control/MM15-SB_raw_R1.fastq.gz, MM15-SB/sequence_quality_control/MM15-SB_raw_R2.fastq.gz output: MM15-SB/sequence_quality_control/read_stats/raw.zip, MM15-SB/sequence_quality_control/read_stats/raw_read_counts.tsv log: MM15-SB/logs/QC/read_stats/raw.log jobid: 84 wildcards: sample=MM15-SB, step=raw priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 05:53:34 2019] Finished job 84. 10 of 459 steps (2%) done

[Tue Mar 5 05:53:34 2019] rule get_read_stats: input: MM24-B/sequence_quality_control/MM24-B_raw_R1.fastq.gz, MM24-B/sequence_quality_control/MM24-B_raw_R2.fastq.gz output: MM24-B/sequence_quality_control/read_stats/raw.zip, MM24-B/sequence_quality_control/read_stats/raw_read_counts.tsv log: MM24-B/logs/QC/read_stats/raw.log jobid: 92 wildcards: sample=MM24-B, step=raw priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 05:55:39 2019] Finished job 92. 11 of 459 steps (2%) done

[Tue Mar 5 05:55:39 2019] rule get_read_stats: input: MM11-SB/sequence_quality_control/MM11-SB_raw_R1.fastq.gz, MM11-SB/sequence_quality_control/MM11-SB_raw_R2.fastq.gz output: MM11-SB/sequence_quality_control/read_stats/raw.zip, MM11-SB/sequence_quality_control/read_stats/raw_read_counts.tsv log: MM11-SB/logs/QC/read_stats/raw.log jobid: 68 wildcards: sample=MM11-SB, step=raw priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 05:57:36 2019] Finished job 68. 12 of 459 steps (3%) done

[Tue Mar 5 05:57:36 2019] rule get_read_stats: input: MM10-SB/sequence_quality_control/MM10-SB_raw_R1.fastq.gz, MM10-SB/sequence_quality_control/MM10-SB_raw_R2.fastq.gz output: MM10-SB/sequence_quality_control/read_stats/raw.zip, MM10-SB/sequence_quality_control/read_stats/raw_read_counts.tsv log: MM10-SB/logs/QC/read_stats/raw.log jobid: 76 wildcards: sample=MM10-SB, step=raw priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 05:59:30 2019] Finished job 76. 13 of 459 steps (3%) done

[Tue Mar 5 05:59:30 2019] rule get_read_stats: input: MM12-SB/sequence_quality_control/MM12-SB_raw_R1.fastq.gz, MM12-SB/sequence_quality_control/MM12-SB_raw_R2.fastq.gz output: MM12-SB/sequence_quality_control/read_stats/raw.zip, MM12-SB/sequence_quality_control/read_stats/raw_read_counts.tsv log: MM12-SB/logs/QC/read_stats/raw.log jobid: 52 wildcards: sample=MM12-SB, step=raw priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:01:11 2019] Finished job 52. 14 of 459 steps (3%) done

[Tue Mar 5 06:01:11 2019] rule deduplicate_reads: input: MM14-SB/sequence_quality_control/MM14-SB_raw_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_raw_R2.fastq.gz output: MM14-SB/sequence_quality_control/MM14-SB_deduplicated_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_deduplicated_R2.fastq.gz log: MM14-SB/logs/QC/deduplicate.log jobid: 221 benchmark: logs/benchmarks/QC/deduplicate/MM14-SB.txt wildcards: sample=MM14-SB threads: 16 resources: mem=32, java_mem=27

        clumpify.sh                 in=MM14-SB/sequence_quality_control/MM14-SB_raw_R1.fastq.gz in2=MM14-SB/sequence_quality_control/MM14-SB_raw_R2.fastq.gz                 out1=MM14-SB/sequence_quality_control/MM14-SB_deduplicated_R1.fastq.gz out2=MM14-SB/sequence_quality_control/MM14-SB_deduplicated_R2.fastq.gz                 overwrite=true                dedupe=t                 dupesubs=2                 optical=f                threads=16                 -Xmx27G 2> MM14-SB/logs/QC/deduplicate.log

Activating conda environment: /databases/conda_envs/81c7173e Removing temporary output file MM14-SB/sequence_quality_control/MM14-SB_raw_R1.fastq.gz. Removing temporary output file MM14-SB/sequence_quality_control/MM14-SB_raw_R2.fastq.gz. [Tue Mar 5 06:03:46 2019] Finished job 221. 15 of 459 steps (3%) done

[Tue Mar 5 06:03:46 2019] rule get_read_stats: input: MM14-SB/sequence_quality_control/MM14-SB_deduplicated_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_deduplicated_R2.fastq.gz output: MM14-SB/sequence_quality_control/read_stats/deduplicated.zip, MM14-SB/sequence_quality_control/read_stats/deduplicated_read_counts.tsv log: MM14-SB/logs/QC/read_stats/deduplicated.log jobid: 61 wildcards: sample=MM14-SB, step=deduplicated priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:05:56 2019] Finished job 61. 16 of 459 steps (3%) done

[Tue Mar 5 06:05:56 2019] rule deduplicate_reads: input: MM13-SB/sequence_quality_control/MM13-SB_raw_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_raw_R2.fastq.gz output: MM13-SB/sequence_quality_control/MM13-SB_deduplicated_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_deduplicated_R2.fastq.gz log: MM13-SB/logs/QC/deduplicate.log jobid: 214 benchmark: logs/benchmarks/QC/deduplicate/MM13-SB.txt wildcards: sample=MM13-SB threads: 16 resources: mem=32, java_mem=27

        clumpify.sh                 in=MM13-SB/sequence_quality_control/MM13-SB_raw_R1.fastq.gz in2=MM13-SB/sequence_quality_control/MM13-SB_raw_R2.fastq.gz                 out1=MM13-SB/sequence_quality_control/MM13-SB_deduplicated_R1.fastq.gz out2=MM13-SB/sequence_quality_control/MM13-SB_deduplicated_R2.fastq.gz                 overwrite=true                dedupe=t                 dupesubs=2                 optical=f                threads=16                 -Xmx27G 2> MM13-SB/logs/QC/deduplicate.log

Activating conda environment: /databases/conda_envs/81c7173e Removing temporary output file MM13-SB/sequence_quality_control/MM13-SB_raw_R1.fastq.gz. Removing temporary output file MM13-SB/sequence_quality_control/MM13-SB_raw_R2.fastq.gz. [Tue Mar 5 06:08:13 2019] Finished job 214. 17 of 459 steps (4%) done

[Tue Mar 5 06:08:13 2019] rule get_read_stats: input: MM13-SB/sequence_quality_control/MM13-SB_deduplicated_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_deduplicated_R2.fastq.gz output: MM13-SB/sequence_quality_control/read_stats/deduplicated.zip, MM13-SB/sequence_quality_control/read_stats/deduplicated_read_counts.tsv log: MM13-SB/logs/QC/read_stats/deduplicated.log jobid: 45 wildcards: sample=MM13-SB, step=deduplicated priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:10:14 2019] Finished job 45. 18 of 459 steps (4%) done

[Tue Mar 5 06:10:14 2019] rule deduplicate_reads: input: MM15-SB/sequence_quality_control/MM15-SB_raw_R1.fastq.gz, MM15-SB/sequence_quality_control/MM15-SB_raw_R2.fastq.gz output: MM15-SB/sequence_quality_control/MM15-SB_deduplicated_R1.fastq.gz, MM15-SB/sequence_quality_control/MM15-SB_deduplicated_R2.fastq.gz log: MM15-SB/logs/QC/deduplicate.log jobid: 230 benchmark: logs/benchmarks/QC/deduplicate/MM15-SB.txt wildcards: sample=MM15-SB threads: 16 resources: mem=32, java_mem=27

        clumpify.sh                 in=MM15-SB/sequence_quality_control/MM15-SB_raw_R1.fastq.gz in2=MM15-SB/sequence_quality_control/MM15-SB_raw_R2.fastq.gz                 out1=MM15-SB/sequence_quality_control/MM15-SB_deduplicated_R1.fastq.gz out2=MM15-SB/sequence_quality_control/MM15-SB_deduplicated_R2.fastq.gz                 overwrite=true                dedupe=t                 dupesubs=2                 optical=f                threads=16                 -Xmx27G 2> MM15-SB/logs/QC/deduplicate.log

Activating conda environment: /databases/conda_envs/81c7173e Removing temporary output file MM15-SB/sequence_quality_control/MM15-SB_raw_R1.fastq.gz. Removing temporary output file MM15-SB/sequence_quality_control/MM15-SB_raw_R2.fastq.gz. [Tue Mar 5 06:12:21 2019] Finished job 230. 19 of 459 steps (4%) done

[Tue Mar 5 06:12:21 2019] rule get_read_stats: input: MM15-SB/sequence_quality_control/MM15-SB_deduplicated_R1.fastq.gz, MM15-SB/sequence_quality_control/MM15-SB_deduplicated_R2.fastq.gz output: MM15-SB/sequence_quality_control/read_stats/deduplicated.zip, MM15-SB/sequence_quality_control/read_stats/deduplicated_read_counts.tsv log: MM15-SB/logs/QC/read_stats/deduplicated.log jobid: 85 wildcards: sample=MM15-SB, step=deduplicated priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:14:22 2019] Finished job 85. 20 of 459 steps (4%) done

[Tue Mar 5 06:14:22 2019] rule deduplicate_reads: input: MM24-B/sequence_quality_control/MM24-B_raw_R1.fastq.gz, MM24-B/sequence_quality_control/MM24-B_raw_R2.fastq.gz output: MM24-B/sequence_quality_control/MM24-B_deduplicated_R1.fastq.gz, MM24-B/sequence_quality_control/MM24-B_deduplicated_R2.fastq.gz log: MM24-B/logs/QC/deduplicate.log jobid: 233 benchmark: logs/benchmarks/QC/deduplicate/MM24-B.txt wildcards: sample=MM24-B threads: 16 resources: mem=32, java_mem=27

        clumpify.sh                 in=MM24-B/sequence_quality_control/MM24-B_raw_R1.fastq.gz in2=MM24-B/sequence_quality_control/MM24-B_raw_R2.fastq.gz                 out1=MM24-B/sequence_quality_control/MM24-B_deduplicated_R1.fastq.gz out2=MM24-B/sequence_quality_control/MM24-B_deduplicated_R2.fastq.gz                 overwrite=true                dedupe=t                 dupesubs=2                 optical=f                threads=16                 -Xmx27G 2> MM24-B/logs/QC/deduplicate.log

Activating conda environment: /databases/conda_envs/81c7173e Removing temporary output file MM24-B/sequence_quality_control/MM24-B_raw_R2.fastq.gz. Removing temporary output file MM24-B/sequence_quality_control/MM24-B_raw_R1.fastq.gz. [Tue Mar 5 06:15:57 2019] Finished job 233. 21 of 459 steps (5%) done

[Tue Mar 5 06:15:57 2019] rule get_read_stats: input: MM24-B/sequence_quality_control/MM24-B_deduplicated_R1.fastq.gz, MM24-B/sequence_quality_control/MM24-B_deduplicated_R2.fastq.gz output: MM24-B/sequence_quality_control/read_stats/deduplicated.zip, MM24-B/sequence_quality_control/read_stats/deduplicated_read_counts.tsv log: MM24-B/logs/QC/read_stats/deduplicated.log jobid: 93 wildcards: sample=MM24-B, step=deduplicated priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:16:58 2019] Finished job 93. 22 of 459 steps (5%) done

[Tue Mar 5 06:16:58 2019] rule deduplicate_reads: input: MM11-SB/sequence_quality_control/MM11-SB_raw_R1.fastq.gz, MM11-SB/sequence_quality_control/MM11-SB_raw_R2.fastq.gz output: MM11-SB/sequence_quality_control/MM11-SB_deduplicated_R1.fastq.gz, MM11-SB/sequence_quality_control/MM11-SB_deduplicated_R2.fastq.gz log: MM11-SB/logs/QC/deduplicate.log jobid: 224 benchmark: logs/benchmarks/QC/deduplicate/MM11-SB.txt wildcards: sample=MM11-SB threads: 16 resources: mem=32, java_mem=27

        clumpify.sh                 in=MM11-SB/sequence_quality_control/MM11-SB_raw_R1.fastq.gz in2=MM11-SB/sequence_quality_control/MM11-SB_raw_R2.fastq.gz                 out1=MM11-SB/sequence_quality_control/MM11-SB_deduplicated_R1.fastq.gz out2=MM11-SB/sequence_quality_control/MM11-SB_deduplicated_R2.fastq.gz                 overwrite=true                dedupe=t                 dupesubs=2                 optical=f                threads=16                 -Xmx27G 2> MM11-SB/logs/QC/deduplicate.log

Activating conda environment: /databases/conda_envs/81c7173e Removing temporary output file MM11-SB/sequence_quality_control/MM11-SB_raw_R2.fastq.gz. Removing temporary output file MM11-SB/sequence_quality_control/MM11-SB_raw_R1.fastq.gz. [Tue Mar 5 06:18:50 2019] Finished job 224. 23 of 459 steps (5%) done

[Tue Mar 5 06:18:50 2019] rule get_read_stats: input: MM11-SB/sequence_quality_control/MM11-SB_deduplicated_R1.fastq.gz, MM11-SB/sequence_quality_control/MM11-SB_deduplicated_R2.fastq.gz output: MM11-SB/sequence_quality_control/read_stats/deduplicated.zip, MM11-SB/sequence_quality_control/read_stats/deduplicated_read_counts.tsv log: MM11-SB/logs/QC/read_stats/deduplicated.log jobid: 69 wildcards: sample=MM11-SB, step=deduplicated priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:20:36 2019] Finished job 69. 24 of 459 steps (5%) done

[Tue Mar 5 06:20:36 2019] rule deduplicate_reads: input: MM10-SB/sequence_quality_control/MM10-SB_raw_R1.fastq.gz, MM10-SB/sequence_quality_control/MM10-SB_raw_R2.fastq.gz output: MM10-SB/sequence_quality_control/MM10-SB_deduplicated_R1.fastq.gz, MM10-SB/sequence_quality_control/MM10-SB_deduplicated_R2.fastq.gz log: MM10-SB/logs/QC/deduplicate.log jobid: 227 benchmark: logs/benchmarks/QC/deduplicate/MM10-SB.txt wildcards: sample=MM10-SB threads: 16 resources: mem=32, java_mem=27

        clumpify.sh                 in=MM10-SB/sequence_quality_control/MM10-SB_raw_R1.fastq.gz in2=MM10-SB/sequence_quality_control/MM10-SB_raw_R2.fastq.gz                 out1=MM10-SB/sequence_quality_control/MM10-SB_deduplicated_R1.fastq.gz out2=MM10-SB/sequence_quality_control/MM10-SB_deduplicated_R2.fastq.gz                 overwrite=true                dedupe=t                 dupesubs=2                 optical=f                threads=16                 -Xmx27G 2> MM10-SB/logs/QC/deduplicate.log

Activating conda environment: /databases/conda_envs/81c7173e Removing temporary output file MM10-SB/sequence_quality_control/MM10-SB_raw_R2.fastq.gz. Removing temporary output file MM10-SB/sequence_quality_control/MM10-SB_raw_R1.fastq.gz. [Tue Mar 5 06:22:33 2019] Finished job 227. 25 of 459 steps (5%) done

[Tue Mar 5 06:22:33 2019] rule get_read_stats: input: MM10-SB/sequence_quality_control/MM10-SB_deduplicated_R1.fastq.gz, MM10-SB/sequence_quality_control/MM10-SB_deduplicated_R2.fastq.gz output: MM10-SB/sequence_quality_control/read_stats/deduplicated.zip, MM10-SB/sequence_quality_control/read_stats/deduplicated_read_counts.tsv log: MM10-SB/logs/QC/read_stats/deduplicated.log jobid: 77 wildcards: sample=MM10-SB, step=deduplicated priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:24:23 2019] Finished job 77. 26 of 459 steps (6%) done

[Tue Mar 5 06:24:23 2019] rule deduplicate_reads: input: MM12-SB/sequence_quality_control/MM12-SB_raw_R1.fastq.gz, MM12-SB/sequence_quality_control/MM12-SB_raw_R2.fastq.gz output: MM12-SB/sequence_quality_control/MM12-SB_deduplicated_R1.fastq.gz, MM12-SB/sequence_quality_control/MM12-SB_deduplicated_R2.fastq.gz log: MM12-SB/logs/QC/deduplicate.log jobid: 218 benchmark: logs/benchmarks/QC/deduplicate/MM12-SB.txt wildcards: sample=MM12-SB threads: 16 resources: mem=32, java_mem=27

        clumpify.sh                 in=MM12-SB/sequence_quality_control/MM12-SB_raw_R1.fastq.gz in2=MM12-SB/sequence_quality_control/MM12-SB_raw_R2.fastq.gz                 out1=MM12-SB/sequence_quality_control/MM12-SB_deduplicated_R1.fastq.gz out2=MM12-SB/sequence_quality_control/MM12-SB_deduplicated_R2.fastq.gz                 overwrite=true                dedupe=t                 dupesubs=2                 optical=f                threads=16                 -Xmx27G 2> MM12-SB/logs/QC/deduplicate.log

Activating conda environment: /databases/conda_envs/81c7173e Removing temporary output file MM12-SB/sequence_quality_control/MM12-SB_raw_R2.fastq.gz. Removing temporary output file MM12-SB/sequence_quality_control/MM12-SB_raw_R1.fastq.gz. [Tue Mar 5 06:26:06 2019] Finished job 218. 27 of 459 steps (6%) done

[Tue Mar 5 06:26:06 2019] rule get_read_stats: input: MM12-SB/sequence_quality_control/MM12-SB_deduplicated_R1.fastq.gz, MM12-SB/sequence_quality_control/MM12-SB_deduplicated_R2.fastq.gz output: MM12-SB/sequence_quality_control/read_stats/deduplicated.zip, MM12-SB/sequence_quality_control/read_stats/deduplicated_read_counts.tsv log: MM12-SB/logs/QC/read_stats/deduplicated.log jobid: 53 wildcards: sample=MM12-SB, step=deduplicated priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:27:43 2019] Finished job 53. 28 of 459 steps (6%) done

[Tue Mar 5 06:27:43 2019] rule apply_quality_filter: input: MM14-SB/sequence_quality_control/MM14-SB_deduplicated_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_deduplicated_R2.fastq.gz, /databases/adapters.fa output: MM14-SB/sequence_quality_control/MM14-SB_filtered_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_filtered_R2.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_filtered_se.fastq.gz, MM14-SB/logs/MM14-SB_quality_filtering_stats.txt log: MM14-SB/logs/QC/quality_filter.log jobid: 59 benchmark: logs/benchmarks/QC/quality_filter/MM14-SB.txt wildcards: sample=MM14-SB threads: 16 resources: mem=32, java_mem=27

    bbduk.sh in=MM14-SB/sequence_quality_control/MM14-SB_deduplicated_R1.fastq.gz in2=MM14-SB/sequence_quality_control/MM14-SB_deduplicated_R2.fastq.gz             ref=/databases/adapters.fa             interleaved=f             out=MM14-SB/sequence_quality_control/MM14-SB_filtered_R1.fastq.gz out2=MM14-SB/sequence_quality_control/MM14-SB_filtered_R2.fastq.gz outs=MM14-SB/sequence_quality_control/MM14-SB_filtered_se.fastq.gz             stats=MM14-SB/logs/MM14-SB_quality_filtering_stats.txt             overwrite=true             qout=33             trd=t             hdist=1             k=27             ktrim=r             mink=8             trimq=10             qtrim=rl             threads=16             minlength=51             maxns=-1             minbasefrequency=0.05             ecco=t             prealloc=t             -Xmx27G 2> MM14-SB/logs/QC/quality_filter.log

Activating conda environment: /databases/conda_envs/81c7173e Removing temporary output file MM14-SB/sequence_quality_control/MM14-SB_deduplicated_R1.fastq.gz. Removing temporary output file MM14-SB/sequence_quality_control/MM14-SB_deduplicated_R2.fastq.gz. [Tue Mar 5 06:30:19 2019] Finished job 59. 29 of 459 steps (6%) done

[Tue Mar 5 06:30:19 2019] rule get_read_stats: input: MM14-SB/sequence_quality_control/MM14-SB_filtered_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_filtered_R2.fastq.gz output: MM14-SB/sequence_quality_control/read_stats/filtered.zip, MM14-SB/sequence_quality_control/read_stats/filtered_read_counts.tsv log: MM14-SB/logs/QC/read_stats/filtered.log jobid: 62 wildcards: sample=MM14-SB, step=filtered priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:32:24 2019] Finished job 62. 30 of 459 steps (7%) done

[Tue Mar 5 06:32:24 2019] rule apply_quality_filter: input: MM13-SB/sequence_quality_control/MM13-SB_deduplicated_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_deduplicated_R2.fastq.gz, /databases/adapters.fa output: MM13-SB/sequence_quality_control/MM13-SB_filtered_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_filtered_R2.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_filtered_se.fastq.gz, MM13-SB/logs/MM13-SB_quality_filtering_stats.txt log: MM13-SB/logs/QC/quality_filter.log jobid: 43 benchmark: logs/benchmarks/QC/quality_filter/MM13-SB.txt wildcards: sample=MM13-SB threads: 16 resources: mem=32, java_mem=27

    bbduk.sh in=MM13-SB/sequence_quality_control/MM13-SB_deduplicated_R1.fastq.gz in2=MM13-SB/sequence_quality_control/MM13-SB_deduplicated_R2.fastq.gz             ref=/databases/adapters.fa             interleaved=f             out=MM13-SB/sequence_quality_control/MM13-SB_filtered_R1.fastq.gz out2=MM13-SB/sequence_quality_control/MM13-SB_filtered_R2.fastq.gz outs=MM13-SB/sequence_quality_control/MM13-SB_filtered_se.fastq.gz             stats=MM13-SB/logs/MM13-SB_quality_filtering_stats.txt             overwrite=true             qout=33             trd=t             hdist=1             k=27             ktrim=r             mink=8             trimq=10             qtrim=rl             threads=16             minlength=51             maxns=-1             minbasefrequency=0.05             ecco=t             prealloc=t             -Xmx27G 2> MM13-SB/logs/QC/quality_filter.log

Activating conda environment: /databases/conda_envs/81c7173e Removing temporary output file MM13-SB/sequence_quality_control/MM13-SB_deduplicated_R1.fastq.gz. Removing temporary output file MM13-SB/sequence_quality_control/MM13-SB_deduplicated_R2.fastq.gz. [Tue Mar 5 06:34:47 2019] Finished job 43. 31 of 459 steps (7%) done

[Tue Mar 5 06:34:47 2019] rule get_read_stats: input: MM13-SB/sequence_quality_control/MM13-SB_filtered_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_filtered_R2.fastq.gz output: MM13-SB/sequence_quality_control/read_stats/filtered.zip, MM13-SB/sequence_quality_control/read_stats/filtered_read_counts.tsv log: MM13-SB/logs/QC/read_stats/filtered.log jobid: 46 wildcards: sample=MM13-SB, step=filtered priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:36:48 2019] Finished job 46. 32 of 459 steps (7%) done

[Tue Mar 5 06:36:48 2019] rule apply_quality_filter: input: MM15-SB/sequence_quality_control/MM15-SB_deduplicated_R1.fastq.gz, MM15-SB/sequence_quality_control/MM15-SB_deduplicated_R2.fastq.gz, /databases/adapters.fa output: MM15-SB/sequence_quality_control/MM15-SB_filtered_R1.fastq.gz, MM15-SB/sequence_quality_control/MM15-SB_filtered_R2.fastq.gz, MM15-SB/sequence_quality_control/MM15-SB_filtered_se.fastq.gz, MM15-SB/logs/MM15-SB_quality_filtering_stats.txt log: MM15-SB/logs/QC/quality_filter.log jobid: 83 benchmark: logs/benchmarks/QC/quality_filter/MM15-SB.txt wildcards: sample=MM15-SB threads: 16 resources: mem=32, java_mem=27

    bbduk.sh in=MM15-SB/sequence_quality_control/MM15-SB_deduplicated_R1.fastq.gz in2=MM15-SB/sequence_quality_control/MM15-SB_deduplicated_R2.fastq.gz             ref=/databases/adapters.fa             interleaved=f             out=MM15-SB/sequence_quality_control/MM15-SB_filtered_R1.fastq.gz out2=MM15-SB/sequence_quality_control/MM15-SB_filtered_R2.fastq.gz outs=MM15-SB/sequence_quality_control/MM15-SB_filtered_se.fastq.gz             stats=MM15-SB/logs/MM15-SB_quality_filtering_stats.txt             overwrite=true             qout=33             trd=t             hdist=1             k=27             ktrim=r             mink=8             trimq=10             qtrim=rl             threads=16             minlength=51             maxns=-1             minbasefrequency=0.05             ecco=t             prealloc=t             -Xmx27G 2> MM15-SB/logs/QC/quality_filter.log

Activating conda environment: /databases/conda_envs/81c7173e Removing temporary output file MM15-SB/sequence_quality_control/MM15-SB_deduplicated_R2.fastq.gz. Removing temporary output file MM15-SB/sequence_quality_control/MM15-SB_deduplicated_R1.fastq.gz. [Tue Mar 5 06:39:05 2019] Finished job 83. 33 of 459 steps (7%) done

[Tue Mar 5 06:39:05 2019] rule get_read_stats: input: MM15-SB/sequence_quality_control/MM15-SB_filtered_R1.fastq.gz, MM15-SB/sequence_quality_control/MM15-SB_filtered_R2.fastq.gz output: MM15-SB/sequence_quality_control/read_stats/filtered.zip, MM15-SB/sequence_quality_control/read_stats/filtered_read_counts.tsv log: MM15-SB/logs/QC/read_stats/filtered.log jobid: 86 wildcards: sample=MM15-SB, step=filtered priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:41:00 2019] Finished job 86. 34 of 459 steps (7%) done

[Tue Mar 5 06:41:00 2019] rule apply_quality_filter: input: MM10-SB/sequence_quality_control/MM10-SB_deduplicated_R1.fastq.gz, MM10-SB/sequence_quality_control/MM10-SB_deduplicated_R2.fastq.gz, /databases/adapters.fa output: MM10-SB/sequence_quality_control/MM10-SB_filtered_R1.fastq.gz, MM10-SB/sequence_quality_control/MM10-SB_filtered_R2.fastq.gz, MM10-SB/sequence_quality_control/MM10-SB_filtered_se.fastq.gz, MM10-SB/logs/MM10-SB_quality_filtering_stats.txt log: MM10-SB/logs/QC/quality_filter.log jobid: 75 benchmark: logs/benchmarks/QC/quality_filter/MM10-SB.txt wildcards: sample=MM10-SB threads: 16 resources: mem=32, java_mem=27

    bbduk.sh in=MM10-SB/sequence_quality_control/MM10-SB_deduplicated_R1.fastq.gz in2=MM10-SB/sequence_quality_control/MM10-SB_deduplicated_R2.fastq.gz             ref=/databases/adapters.fa             interleaved=f             out=MM10-SB/sequence_quality_control/MM10-SB_filtered_R1.fastq.gz out2=MM10-SB/sequence_quality_control/MM10-SB_filtered_R2.fastq.gz outs=MM10-SB/sequence_quality_control/MM10-SB_filtered_se.fastq.gz             stats=MM10-SB/logs/MM10-SB_quality_filtering_stats.txt             overwrite=true             qout=33             trd=t             hdist=1             k=27             ktrim=r             mink=8             trimq=10             qtrim=rl             threads=16             minlength=51             maxns=-1             minbasefrequency=0.05             ecco=t             prealloc=t             -Xmx27G 2> MM10-SB/logs/QC/quality_filter.log

Activating conda environment: /databases/conda_envs/81c7173e Removing temporary output file MM10-SB/sequence_quality_control/MM10-SB_deduplicated_R2.fastq.gz. Removing temporary output file MM10-SB/sequence_quality_control/MM10-SB_deduplicated_R1.fastq.gz. [Tue Mar 5 06:43:22 2019] Finished job 75. 35 of 459 steps (8%) done

[Tue Mar 5 06:43:22 2019] rule get_read_stats: input: MM10-SB/sequence_quality_control/MM10-SB_filtered_R1.fastq.gz, MM10-SB/sequence_quality_control/MM10-SB_filtered_R2.fastq.gz output: MM10-SB/sequence_quality_control/read_stats/filtered.zip, MM10-SB/sequence_quality_control/read_stats/filtered_read_counts.tsv log: MM10-SB/logs/QC/read_stats/filtered.log jobid: 78 wildcards: sample=MM10-SB, step=filtered priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:45:11 2019] Finished job 78. 36 of 459 steps (8%) done

[Tue Mar 5 06:45:11 2019] rule apply_quality_filter: input: MM12-SB/sequence_quality_control/MM12-SB_deduplicated_R1.fastq.gz, MM12-SB/sequence_quality_control/MM12-SB_deduplicated_R2.fastq.gz, /databases/adapters.fa output: MM12-SB/sequence_quality_control/MM12-SB_filtered_R1.fastq.gz, MM12-SB/sequence_quality_control/MM12-SB_filtered_R2.fastq.gz, MM12-SB/sequence_quality_control/MM12-SB_filtered_se.fastq.gz, MM12-SB/logs/MM12-SB_quality_filtering_stats.txt log: MM12-SB/logs/QC/quality_filter.log jobid: 51 benchmark: logs/benchmarks/QC/quality_filter/MM12-SB.txt wildcards: sample=MM12-SB threads: 16 resources: mem=32, java_mem=27

    bbduk.sh in=MM12-SB/sequence_quality_control/MM12-SB_deduplicated_R1.fastq.gz in2=MM12-SB/sequence_quality_control/MM12-SB_deduplicated_R2.fastq.gz             ref=/databases/adapters.fa             interleaved=f             out=MM12-SB/sequence_quality_control/MM12-SB_filtered_R1.fastq.gz out2=MM12-SB/sequence_quality_control/MM12-SB_filtered_R2.fastq.gz outs=MM12-SB/sequence_quality_control/MM12-SB_filtered_se.fastq.gz             stats=MM12-SB/logs/MM12-SB_quality_filtering_stats.txt             overwrite=true             qout=33             trd=t             hdist=1             k=27             ktrim=r             mink=8             trimq=10             qtrim=rl             threads=16             minlength=51             maxns=-1             minbasefrequency=0.05             ecco=t             prealloc=t             -Xmx27G 2> MM12-SB/logs/QC/quality_filter.log

Activating conda environment: /databases/conda_envs/81c7173e Removing temporary output file MM12-SB/sequence_quality_control/MM12-SB_deduplicated_R2.fastq.gz. Removing temporary output file MM12-SB/sequence_quality_control/MM12-SB_deduplicated_R1.fastq.gz. [Tue Mar 5 06:47:11 2019] Finished job 51. 37 of 459 steps (8%) done

[Tue Mar 5 06:47:11 2019] rule get_read_stats: input: MM12-SB/sequence_quality_control/MM12-SB_filtered_R1.fastq.gz, MM12-SB/sequence_quality_control/MM12-SB_filtered_R2.fastq.gz output: MM12-SB/sequence_quality_control/read_stats/filtered.zip, MM12-SB/sequence_quality_control/read_stats/filtered_read_counts.tsv log: MM12-SB/logs/QC/read_stats/filtered.log jobid: 54 wildcards: sample=MM12-SB, step=filtered priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:48:37 2019] Finished job 54. 38 of 459 steps (8%) done

[Tue Mar 5 06:48:37 2019] rule apply_quality_filter: input: MM11-SB/sequence_quality_control/MM11-SB_deduplicated_R1.fastq.gz, MM11-SB/sequence_quality_control/MM11-SB_deduplicated_R2.fastq.gz, /databases/adapters.fa output: MM11-SB/sequence_quality_control/MM11-SB_filtered_R1.fastq.gz, MM11-SB/sequence_quality_control/MM11-SB_filtered_R2.fastq.gz, MM11-SB/sequence_quality_control/MM11-SB_filtered_se.fastq.gz, MM11-SB/logs/MM11-SB_quality_filtering_stats.txt log: MM11-SB/logs/QC/quality_filter.log jobid: 67 benchmark: logs/benchmarks/QC/quality_filter/MM11-SB.txt wildcards: sample=MM11-SB threads: 16 resources: mem=32, java_mem=27

    bbduk.sh in=MM11-SB/sequence_quality_control/MM11-SB_deduplicated_R1.fastq.gz in2=MM11-SB/sequence_quality_control/MM11-SB_deduplicated_R2.fastq.gz             ref=/databases/adapters.fa             interleaved=f             out=MM11-SB/sequence_quality_control/MM11-SB_filtered_R1.fastq.gz out2=MM11-SB/sequence_quality_control/MM11-SB_filtered_R2.fastq.gz outs=MM11-SB/sequence_quality_control/MM11-SB_filtered_se.fastq.gz             stats=MM11-SB/logs/MM11-SB_quality_filtering_stats.txt             overwrite=true             qout=33             trd=t             hdist=1             k=27             ktrim=r             mink=8             trimq=10             qtrim=rl             threads=16             minlength=51             maxns=-1             minbasefrequency=0.05             ecco=t             prealloc=t             -Xmx27G 2> MM11-SB/logs/QC/quality_filter.log

Activating conda environment: /databases/conda_envs/81c7173e Removing temporary output file MM11-SB/sequence_quality_control/MM11-SB_deduplicated_R1.fastq.gz. Removing temporary output file MM11-SB/sequence_quality_control/MM11-SB_deduplicated_R2.fastq.gz. [Tue Mar 5 06:50:44 2019] Finished job 67. 39 of 459 steps (8%) done

[Tue Mar 5 06:50:44 2019] rule get_read_stats: input: MM11-SB/sequence_quality_control/MM11-SB_filtered_R1.fastq.gz, MM11-SB/sequence_quality_control/MM11-SB_filtered_R2.fastq.gz output: MM11-SB/sequence_quality_control/read_stats/filtered.zip, MM11-SB/sequence_quality_control/read_stats/filtered_read_counts.tsv log: MM11-SB/logs/QC/read_stats/filtered.log jobid: 70 wildcards: sample=MM11-SB, step=filtered priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:52:33 2019] Finished job 70. 40 of 459 steps (9%) done

[Tue Mar 5 06:52:33 2019] rule apply_quality_filter: input: MM24-B/sequence_quality_control/MM24-B_deduplicated_R1.fastq.gz, MM24-B/sequence_quality_control/MM24-B_deduplicated_R2.fastq.gz, /databases/adapters.fa output: MM24-B/sequence_quality_control/MM24-B_filtered_R1.fastq.gz, MM24-B/sequence_quality_control/MM24-B_filtered_R2.fastq.gz, MM24-B/sequence_quality_control/MM24-B_filtered_se.fastq.gz, MM24-B/logs/MM24-B_quality_filtering_stats.txt log: MM24-B/logs/QC/quality_filter.log jobid: 91 benchmark: logs/benchmarks/QC/quality_filter/MM24-B.txt wildcards: sample=MM24-B threads: 16 resources: mem=32, java_mem=27

    bbduk.sh in=MM24-B/sequence_quality_control/MM24-B_deduplicated_R1.fastq.gz in2=MM24-B/sequence_quality_control/MM24-B_deduplicated_R2.fastq.gz             ref=/databases/adapters.fa             interleaved=f             out=MM24-B/sequence_quality_control/MM24-B_filtered_R1.fastq.gz out2=MM24-B/sequence_quality_control/MM24-B_filtered_R2.fastq.gz outs=MM24-B/sequence_quality_control/MM24-B_filtered_se.fastq.gz             stats=MM24-B/logs/MM24-B_quality_filtering_stats.txt             overwrite=true             qout=33             trd=t             hdist=1             k=27             ktrim=r             mink=8             trimq=10             qtrim=rl             threads=16             minlength=51             maxns=-1             minbasefrequency=0.05             ecco=t             prealloc=t             -Xmx27G 2> MM24-B/logs/QC/quality_filter.log

Activating conda environment: /databases/conda_envs/81c7173e Removing temporary output file MM24-B/sequence_quality_control/MM24-B_deduplicated_R2.fastq.gz. Removing temporary output file MM24-B/sequence_quality_control/MM24-B_deduplicated_R1.fastq.gz. [Tue Mar 5 06:54:03 2019] Finished job 91. 41 of 459 steps (9%) done

[Tue Mar 5 06:54:03 2019] rule get_read_stats: input: MM24-B/sequence_quality_control/MM24-B_filtered_R1.fastq.gz, MM24-B/sequence_quality_control/MM24-B_filtered_R2.fastq.gz output: MM24-B/sequence_quality_control/read_stats/filtered.zip, MM24-B/sequence_quality_control/read_stats/filtered_read_counts.tsv log: MM24-B/logs/QC/read_stats/filtered.log jobid: 94 wildcards: sample=MM24-B, step=filtered priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:55:03 2019] Finished job 94. 42 of 459 steps (9%) done

[Tue Mar 5 06:55:03 2019] rule initialize_checkm: input: /databases/checkm/taxon_marker_sets.tsv, /databases/checkm/selected_marker_sets.tsv, /databases/checkm/pfam/tigrfam2pfam.tsv, /databases/checkm/pfam/Pfam-A.hmm.dat, /databases/checkm/img/img_metadata.tsv, /databases/checkm/hmms_ssu/SSU_euk.hmm, /databases/checkm/hmms_ssu/SSU_bacteria.hmm, /databases/checkm/hmms_ssu/SSU_archaea.hmm, /databases/checkm/hmms_ssu/createHMMs.py, /databases/checkm/hmms/phylo.hmm.ssi, /databases/checkm/hmms/phylo.hmm, /databases/checkm/hmms/checkm.hmm.ssi, /databases/checkm/hmms/checkm.hmm, /databases/checkm/genome_tree/missing_duplicate_genes_97.tsv, /databases/checkm/genome_tree/missing_duplicate_genes_50.tsv, /databases/checkm/genome_tree/genome_tree.taxonomy.tsv, /databases/checkm/genome_tree/genome_tree_reduced.refpkg/phylomodelJqWx6.json, /databases/checkm/genome_tree/genome_tree_reduced.refpkg/genome_tree.tre, /databases/checkm/genome_tree/genome_tree_reduced.refpkg/genome_tree.log, /databases/checkm/genome_tree/genome_tree_reduced.refpkg/genome_tree.fasta, /databases/checkm/genome_tree/genome_tree_reduced.refpkg/CONTENTS.json, /databases/checkm/genome_tree/genome_tree.metadata.tsv, /databases/checkm/genome_tree/genome_tree_full.refpkg/phylo_modelEcOyPk.json, /databases/checkm/genome_tree/genome_tree_full.refpkg/genome_tree.tre, /databases/checkm/genome_tree/genome_tree_full.refpkg/genome_tree.log, /databases/checkm/genome_tree/genome_tree_full.refpkg/genome_tree.fasta, /databases/checkm/genome_tree/genome_tree_full.refpkg/CONTENTS.json, /databases/checkm/genome_tree/genome_tree.derep.txt, /databases/checkm/.dmanifest, /databases/checkm/distributions/td_dist.txt, /databases/checkm/distributions/gc_dist.txt, /databases/checkm/distributions/cd_dist.txt output: logs/checkm_init.txt log: logs/initialize_checkm.log jobid: 284

    python /opt/conda/lib/python3.6/site-packages/atlas/rules/initialize_checkm.py             /databases/checkm             logs/checkm_init.txt             logs/initialize_checkm.log

Activating conda environment: /databases/conda_envs/420ace10 [Tue Mar 5 06:55:23 2019] Finished job 284. 43 of 459 steps (9%) done

[Tue Mar 5 06:55:23 2019] rule build_decontamination_db: input: /databases/phiX174_virus.fa output: ref/genome/1/summary.txt log: logs/QC/build_decontamination_db.log jobid: 309 threads: 16 resources: mem=32, java_mem=27

        bbsplit.sh -Xmx27G ref_PhiX=/databases/phiX174_virus.fa                 threads=16 k=13 local=t 2> logs/QC/build_decontamination_db.log

Activating conda environment: /databases/conda_envs/81c7173e [Tue Mar 5 06:55:27 2019] Finished job 309. 44 of 459 steps (10%) done

[Tue Mar 5 06:55:27 2019] rule run_decontamination: input: MM14-SB/sequence_quality_control/MM14-SB_filtered_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_filtered_R2.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_filtered_se.fastq.gz, ref/genome/1/summary.txt output: MM14-SB/sequence_quality_control/MM14-SB_clean_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_clean_R2.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_clean_se.fastq.gz, MM14-SB/sequence_quality_control/contaminants/PhiX_R1.fastq.gz, MM14-SB/sequence_quality_control/contaminants/PhiX_R2.fastq.gz, MM14-SB/sequence_quality_control/contaminants/PhiX_se.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_decontamination_reference_stats.txt log: MM14-SB/logs/QC/decontamination.log jobid: 220 benchmark: logs/benchmarks/QC/decontamination/MM14-SB.txt wildcards: sample=MM14-SB threads: 16 resources: mem=32, java_mem=27

        if [ "true" = true ] ; then
            bbsplit.sh in1=MM14-SB/sequence_quality_control/MM14-SB_filtered_R1.fastq.gz in2=MM14-SB/sequence_quality_control/MM14-SB_filtered_R2.fastq.gz                     outu1=MM14-SB/sequence_quality_control/MM14-SB_clean_R1.fastq.gz outu2=MM14-SB/sequence_quality_control/MM14-SB_clean_R2.fastq.gz                     basename="MM14-SB/sequence_quality_control/contaminants/%_R#.fastq.gz"                     maxindel=20 minratio=0.65                     minhits=1 ambiguous=best refstats=MM14-SB/sequence_quality_control/MM14-SB_decontamination_reference_stats.txt                    threads=16 k=13 local=t                     -Xmx27G 2> MM14-SB/logs/QC/decontamination.log
        fi

        bbsplit.sh in=MM14-SB/sequence_quality_control/MM14-SB_filtered_se.fastq.gz                  outu=MM14-SB/sequence_quality_control/MM14-SB_clean_se.fastq.gz                 basename="MM14-SB/sequence_quality_control/contaminants/%_se.fastq.gz"                 maxindel=20 minratio=0.65                 minhits=1 ambiguous=best refstats=MM14-SB/sequence_quality_control/MM14-SB_decontamination_reference_stats.txt append                 interleaved=f threads=16 k=13 local=t                 -Xmx27G 2>> MM14-SB/logs/QC/decontamination.log

Activating conda environment: /databases/conda_envs/81c7173e Removing temporary output file MM14-SB/sequence_quality_control/MM14-SB_filtered_R1.fastq.gz. Removing temporary output file MM14-SB/sequence_quality_control/MM14-SB_filtered_R2.fastq.gz. Removing temporary output file MM14-SB/sequence_quality_control/MM14-SB_filtered_se.fastq.gz. [Tue Mar 5 06:57:03 2019] Finished job 220. 45 of 459 steps (10%) done

[Tue Mar 5 06:57:03 2019] rule get_read_stats: input: MM14-SB/sequence_quality_control/MM14-SB_clean_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_clean_R2.fastq.gz output: MM14-SB/sequence_quality_control/read_stats/clean.zip, MM14-SB/sequence_quality_control/read_stats/clean_read_counts.tsv log: MM14-SB/logs/QC/read_stats/clean.log jobid: 63 wildcards: sample=MM14-SB, step=clean priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:59:09 2019] Finished job 63. 46 of 459 steps (10%) done

[Tue Mar 5 06:59:09 2019] localrule qcreads: input: MM14-SB/sequence_quality_control/MM14-SB_clean_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_clean_R2.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_clean_se.fastq.gz output: MM14-SB/sequence_quality_control/MM14-SB_QC_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_QC_R2.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_QC_se.fastq.gz jobid: 58 wildcards: sample=MM14-SB

Removing temporary output file MM14-SB/sequence_quality_control/MM14-SB_clean_R1.fastq.gz. Removing temporary output file MM14-SB/sequence_quality_control/MM14-SB_clean_R2.fastq.gz. Removing temporary output file MM14-SB/sequence_quality_control/MM14-SB_clean_se.fastq.gz. [Tue Mar 5 06:59:15 2019] Finished job 58. 47 of 459 steps (10%) done

[Tue Mar 5 06:59:15 2019] rule get_read_stats: input: MM14-SB/sequence_quality_control/MM14-SB_QC_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_QC_R2.fastq.gz output: MM14-SB/sequence_quality_control/read_stats/QC.zip, MM14-SB/sequence_quality_control/read_stats/QC_read_counts.tsv log: MM14-SB/logs/QC/read_stats/QC.log jobid: 64 wildcards: sample=MM14-SB, step=QC priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 07:01:21 2019] Finished job 64. 48 of 459 steps (10%) done

[Tue Mar 5 07:01:21 2019] rule run_decontamination: input: MM13-SB/sequence_quality_control/MM13-SB_filtered_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_filtered_R2.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_filtered_se.fastq.gz, ref/genome/1/summary.txt output: MM13-SB/sequence_quality_control/MM13-SB_clean_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_clean_R2.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_clean_se.fastq.gz, MM13-SB/sequence_quality_control/contaminants/PhiX_R1.fastq.gz, MM13-SB/sequence_quality_control/contaminants/PhiX_R2.fastq.gz, MM13-SB/sequence_quality_control/contaminants/PhiX_se.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_decontamination_reference_stats.txt log: MM13-SB/logs/QC/decontamination.log jobid: 213 benchmark: logs/benchmarks/QC/decontamination/MM13-SB.txt wildcards: sample=MM13-SB threads: 16 resources: mem=32, java_mem=27

        if [ "true" = true ] ; then
            bbsplit.sh in1=MM13-SB/sequence_quality_control/MM13-SB_filtered_R1.fastq.gz in2=MM13-SB/sequence_quality_control/MM13-SB_filtered_R2.fastq.gz                     outu1=MM13-SB/sequence_quality_control/MM13-SB_clean_R1.fastq.gz outu2=MM13-SB/sequence_quality_control/MM13-SB_clean_R2.fastq.gz                     basename="MM13-SB/sequence_quality_control/contaminants/%_R#.fastq.gz"                     maxindel=20 minratio=0.65                     minhits=1 ambiguous=best refstats=MM13-SB/sequence_quality_control/MM13-SB_decontamination_reference_stats.txt                    threads=16 k=13 local=t                     -Xmx27G 2> MM13-SB/logs/QC/decontamination.log
        fi

        bbsplit.sh in=MM13-SB/sequence_quality_control/MM13-SB_filtered_se.fastq.gz                  outu=MM13-SB/sequence_quality_control/MM13-SB_clean_se.fastq.gz                 basename="MM13-SB/sequence_quality_control/contaminants/%_se.fastq.gz"                 maxindel=20 minratio=0.65                 minhits=1 ambiguous=best refstats=MM13-SB/sequence_quality_control/MM13-SB_decontamination_reference_stats.txt append                 interleaved=f threads=16 k=13 local=t                 -Xmx27G 2>> MM13-SB/logs/QC/decontamination.log

Activating conda environment: /databases/conda_envs/81c7173e Removing temporary output file MM13-SB/sequence_quality_control/MM13-SB_filtered_R1.fastq.gz. Removing temporary output file MM13-SB/sequence_quality_control/MM13-SB_filtered_R2.fastq.gz. Removing temporary output file MM13-SB/sequence_quality_control/MM13-SB_filtered_se.fastq.gz. [Tue Mar 5 07:02:52 2019] Finished job 213. 49 of 459 steps (11%) done

[Tue Mar 5 07:02:52 2019] rule get_read_stats: input: MM13-SB/sequence_quality_control/MM13-SB_clean_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_clean_R2.fastq.gz output: MM13-SB/sequence_quality_control/read_stats/clean.zip, MM13-SB/sequence_quality_control/read_stats/clean_read_counts.tsv log: MM13-SB/logs/QC/read_stats/clean.log jobid: 47 wildcards: sample=MM13-SB, step=clean priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 07:04:50 2019] Finished job 47. 50 of 459 steps (11%) done

[Tue Mar 5 07:04:50 2019] localrule init_pre_assembly_processing: input: MM14-SB/sequence_quality_control/MM14-SB_QC_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_QC_R2.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_QC_se.fastq.gz output: MM14-SB/assembly/reads/QC_se.fastq.gz jobid: 443 wildcards: sample=MM14-SB, fraction=se

[Tue Mar 5 07:04:50 2019] localrule init_pre_assembly_processing: input: MM14-SB/sequence_quality_control/MM14-SB_QC_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_QC_R2.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_QC_se.fastq.gz output: MM14-SB/assembly/reads/QC_R2.fastq.gz jobid: 442 wildcards: sample=MM14-SB, fraction=R2

[Tue Mar 5 07:04:50 2019] localrule qcreads: input: MM13-SB/sequence_quality_control/MM13-SB_clean_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_clean_R2.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_clean_se.fastq.gz output: MM13-SB/sequence_quality_control/MM13-SB_QC_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_QC_R2.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_QC_se.fastq.gz jobid: 42 wildcards: sample=MM13-SB

[Tue Mar 5 07:04:50 2019] localrule init_pre_assembly_processing: input: MM14-SB/sequence_quality_control/MM14-SB_QC_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_QC_R2.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_QC_se.fastq.gz output: MM14-SB/assembly/reads/QC_R1.fastq.gz jobid: 441 wildcards: sample=MM14-SB, fraction=R1

[Tue Mar 5 07:04:50 2019] localrule write_read_counts: input: MM14-SB/sequence_quality_control/read_stats/raw_read_counts.tsv, MM14-SB/sequence_quality_control/read_stats/deduplicated_read_counts.tsv, MM14-SB/sequence_quality_control/read_stats/filtered_read_counts.tsv, MM14-SB/sequence_quality_control/read_stats/clean_read_counts.tsv, MM14-SB/sequence_quality_control/read_stats/QC_read_counts.tsv output: MM14-SB/sequence_quality_control/read_stats/read_counts.tsv jobid: 237 wildcards: sample=MM14-SB

[Tue Mar 5 07:04:52 2019] Finished job 441. 51 of 459 steps (11%) done [Tue Mar 5 07:04:53 2019] Finished job 442. 52 of 459 steps (11%) done [Tue Mar 5 07:04:53 2019] Finished job 443. 53 of 459 steps (12%) done Removing temporary output file MM14-SB/sequence_quality_control/read_stats/raw_read_counts.tsv. Removing temporary output file MM14-SB/sequence_quality_control/read_stats/deduplicated_read_counts.tsv. Removing temporary output file MM14-SB/sequence_quality_control/read_stats/filtered_read_counts.tsv. Removing temporary output file MM14-SB/sequence_quality_control/read_stats/clean_read_counts.tsv. Removing temporary output file MM14-SB/sequence_quality_control/read_stats/QC_read_counts.tsv. [Tue Mar 5 07:04:53 2019] Finished job 237. 54 of 459 steps (12%) done Removing temporary output file MM13-SB/sequence_quality_control/MM13-SB_clean_R2.fastq.gz. Removing temporary output file MM13-SB/sequence_quality_control/MM13-SB_clean_se.fastq.gz. Removing temporary output file MM13-SB/sequence_quality_control/MM13-SB_clean_R1.fastq.gz. [Tue Mar 5 07:04:56 2019] Finished job 42. 55 of 459 steps (12%) done

[Tue Mar 5 07:04:56 2019] rule get_read_stats: input: MM13-SB/sequence_quality_control/MM13-SB_QC_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_QC_R2.fastq.gz output: MM13-SB/sequence_quality_control/read_stats/QC.zip, MM13-SB/sequence_quality_control/read_stats/QC_read_counts.tsv log: MM13-SB/logs/QC/read_stats/QC.log jobid: 48 wildcards: sample=MM13-SB, step=QC priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 07:07:00 2019] Finished job 48. 56 of 459 steps (12%) done

[Tue Mar 5 07:07:00 2019] rule error_correction: input: MM14-SB/assembly/reads/QC_R1.fastq.gz, MM14-SB/assembly/reads/QC_R2.fastq.gz, MM14-SB/assembly/reads/QC_se.fastq.gz output: MM14-SB/assembly/reads/QC.errorcorr_R1.fastq.gz, MM14-SB/assembly/reads/QC.errorcorr_R2.fastq.gz, MM14-SB/assembly/reads/QC.errorcorr_se.fastq.gz log: MM14-SB/logs/assembly/pre_process/error_correction_QC.log jobid: 423 benchmark: logs/benchmarks/assembly/pre_process/MM14-SB_error_correction_QC.txt wildcards: sample=MM14-SB, previous_steps=QC threads: 16 resources: mem=32, java_mem=27

    tadpole.sh -Xmx27G             prealloc=1             in1=MM14-SB/assembly/reads/QC_R1.fastq.gz,MM14-SB/assembly/reads/QC_se.fastq.gz in2=MM14-SB/assembly/reads/QC_R2.fastq.gz             out1=MM14-SB/assembly/reads/QC.errorcorr_R1.fastq.gz,MM14-SB/assembly/reads/QC.errorcorr_se.fastq.gz out2=MM14-SB/assembly/reads/QC.errorcorr_R2.fastq.gz             mode=correct             threads=16             ecc=t ecco=t 2>> MM14-SB/logs/assembly/pre_process/error_correction_QC.log

Activating conda environment: /databases/conda_envs/81c7173e Removing temporary output file MM14-SB/assembly/reads/QC_R1.fastq.gz. Removing temporary output file MM14-SB/assembly/reads/QC_R2.fastq.gz. Removing temporary output file MM14-SB/assembly/reads/QC_se.fastq.gz. [Tue Mar 5 07:11:17 2019] Finished job 423. 57 of 459 steps (12%) done

[Tue Mar 5 07:11:17 2019] rule merge_pairs: input: MM14-SB/assembly/reads/QC.errorcorr_R1.fastq.gz, MM14-SB/assembly/reads/QC.errorcorr_R2.fastq.gz, MM14-SB/assembly/reads/QC.errorcorr_se.fastq.gz output: MM14-SB/assembly/reads/QC.errorcorr.merged_R1.fastq.gz, MM14-SB/assembly/reads/QC.errorcorr.merged_R2.fastq.gz, MM14-SB/assembly/reads/QC.errorcorr.merged_se.fastq.gz, MM14-SB/assembly/reads/QC.errorcorr.merged_me.fastq.gz log: MM14-SB/logs/assembly/pre_process/merge_pairs_QC.errorcorr.log jobid: 401 benchmark: logs/benchmarks/assembly/pre_process/merge_pairs_QC.errorcorr/MM14-SB.txt wildcards: sample=MM14-SB, previous_steps=QC.errorcorr threads: 16 resources: mem=32, java_mem=27

    bbmerge.sh -Xmx27G threads=16             in1=MM14-SB/assembly/reads/QC.errorcorr_R1.fastq.gz in2=MM14-SB/assembly/reads/QC.errorcorr_R2.fastq.gz             o
SilasK commented 5 years ago

Probably your attached files didn't pass through.

Yes, you can configure (rna)SPAdes in the config file that is generated by atlas init:


# Spades
#------------
spades_skip_BayesHammer: False
spades_use_scaffolds: false # otherwise use contigs
#Comma-separated list of k-mer sizes to be used (all values must be odd, less than 128 and listed in ascending order).
spades_k: auto
spades_preset: rna    # meta, normal, or rna; single-end libraries don't work with metaSPAdes
spades_extra: "" # extra keywords 

# Filtering
#------------
prefilter_minimum_contig_length: 200
# filter out assembled noise
# after filtering
minimum_contig_length: 300
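
If you set spades_k to an explicit list instead of auto, the rules in the config comment above (odd values, less than 128, ascending order) can be checked before starting a run. This is only a minimal sketch and not part of Atlas: the function name validate_spades_k is hypothetical, and it assumes the value is either auto or a comma-separated string such as "21,33,55".

```python
def validate_spades_k(value):
    """Check a spades_k setting against the rules stated in the config comment.

    Hypothetical helper, not an Atlas function. Returns "auto" or the parsed
    list of k-mer sizes, and raises ValueError if a rule is violated.
    """
    text = str(value).strip()
    if text.lower() == "auto":
        return "auto"

    # Assumption: an explicit setting is a comma-separated list of integers.
    ks = [int(k) for k in text.split(",")]

    for k in ks:
        if k % 2 == 0:
            raise ValueError(f"k-mer size {k} is even; all values must be odd")
        if k >= 128:
            raise ValueError(f"k-mer size {k} must be less than 128")

    # Treat "ascending" as strictly ascending, so repeated values are rejected too.
    if any(a >= b for a, b in zip(ks, ks[1:])):
        raise ValueError("k-mer sizes must be listed in ascending order")

    return ks


if __name__ == "__main__":
    print(validate_spades_k("auto"))
    print(validate_spades_k("21,33,55"))
```

A value such as "21,32" or "55,33" raises a ValueError, so a typo in the list is caught before the assembly step is reached.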
MCamp91 commented 5 years ago

Hello,

I've reattached the files. Can you view them?

Would lowering the contig length make a difference?

Cheers,

Matt



Dynamic output is deprecated in favor of checkpoints (see docs). It will be removed in Snakemake 6.0.
Dynamic output is deprecated in favor of checkpoints (see docs). It will be removed in Snakemake 6.0.
Dynamic output is deprecated in favor of checkpoints (see docs). It will be removed in Snakemake 6.0.
Dynamic output is deprecated in favor of checkpoints (see docs). It will be removed in Snakemake 6.0.
Dynamic output is deprecated in favor of checkpoints (see docs). It will be removed in Snakemake 6.0.
Dynamic output is deprecated in favor of checkpoints (see docs). It will be removed in Snakemake 6.0.
Dynamic output is deprecated in favor of checkpoints (see docs). It will be removed in Snakemake 6.0.
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 16
Rules claiming more threads will be scaled down.
Unlimited resources: mem, java_mem
Job counts:
    count   jobs
    7       align_reads_to_MAGs
    7       align_reads_to_final_contigs
    7       align_reads_to_prefilter_contigs
    1       all
    7       apply_quality_filter
    1       assembly
    7       assembly_one_sample
    7       bam_2_sam_MAGs
    7       bam_2_sam_binning
    7       bam_2_sam_contigs
    1       binning
    1       build_assembly_report
    1       build_bin_report
    1       build_db_genomes
    1       build_decontamination_db
    1       build_qc_report
    14      calculate_contigs_stats
    7       calculate_insert_size
    1       cat_get_name
    1       cat_on_bin
    1       cluster_genes
    1       combine_annotations
    1       combine_bined_coverages_MAGs
    7       combine_coverages
    1       combine_coverages_MAGs
    1       combine_insert_stats
    1       combine_read_counts
    1       combine_read_length_stats
    1       concat_genes
    7       convert_concoct_csv_to_tsv
    14      convert_sam_to_bam
    7       deduplicate_reads
    1       download_cat_db
    1       download_eggNOG_fastas
    3       download_eggNOG_files
    1       eggNOG_annotation
    1       eggNOG_homology_search
    7       error_correction
    7       filter_by_coverage
    1       filter_genes
    7       finalize_contigs
    7       finalize_sample_qc
    7       find_16S
    1       first_dereplication
    1       gene_subsets
    1       genecatalog
    1       genomes
    1       get_all_16S
    1       get_all_bins
    7       get_bins
    7       get_contig_coverage_from_bb
    7       get_contigs_from_gene_names
    1       get_genome_for_cat
    1       get_genomes2cluster
    7       get_maxbin_cluster_attribution
    7       get_metabat_depth_file
    1       get_quality_for_dRep_from_checkm
    35      get_read_stats
    1       get_rep_proteins
    21      get_unique_bin_ids
    21      get_unique_cluster_attribution
    21      init_pre_assembly_processing
    1       initialize_checkm
    7       initialize_qc
    7       maxbin
    7       merge_pairs
    1       merge_taxonomy
    7       metabat
    7       pileup
    7       pileup_MAGs
    7       pileup_for_binning
    7       pileup_prefilter
    7       predict_genes
    1       predict_genes_genomes
    1       qc
    7       qcreads
    7       rename_contigs
    1       rename_gene_catalog
    1       rename_genomes
    7       rename_megahit_output
    1       rename_protein_catalog
    1       run_all_checkm_lineage_wf
    7       run_checkm_lineage_wf
    8       run_checkm_tree_qa
    7       run_concoct
    7       run_das_tool
    7       run_decontamination
    7       run_megahit
    1       second_dereplication
    7       write_read_counts
    459

[Tue Mar 5 05:34:36 2019] rule initialize_qc: input: /data/work/24samples/RRBS_Metagenomes/MM14_SB_R1.fastq.gz, /data/work/24samples/RRBS_Metagenomes/MM14_SB_R2.fastq.gz output: MM14-SB/sequence_quality_control/MM14-SB_raw_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_raw_R2.fastq.gz log: MM14-SB/logs/QC/init.log jobid: 222 wildcards: sample=MM14-SB priority: 80 threads: 16 resources: mem=32, java_mem=27

    reformat.sh in=/data/work/24samples/RRBS_Metagenomes/MM14_SB_R1.fastq.gz in2=/data/work/24samples/RRBS_Metagenomes/MM14_SB_R2.fastq.gz             interleaved=f             out1=MM14-SB/sequence_quality_control/MM14-SB_raw_R1.fastq.gz out2=MM14-SB/sequence_quality_control/MM14-SB_raw_R2.fastq.gz             iupacToN=t             touppercase=t             qout=33             overwrite=true             verifypaired=t             addslash=t             trimreaddescription=t             threads=16             -Xmx27G 2> MM14-SB/logs/QC/init.log

Activating conda environment: /databases/conda_envs/81c7173e [Tue Mar 5 05:36:31 2019] Finished job 222. 1 of 459 steps (0.22%) done

[Tue Mar 5 05:36:31 2019] rule initialize_qc: input: /data/work/24samples/RRBS_Metagenomes/MM13_SB_R1.fastq.gz, /data/work/24samples/RRBS_Metagenomes/MM13_SB_R2.fastq.gz output: MM13-SB/sequence_quality_control/MM13-SB_raw_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_raw_R2.fastq.gz log: MM13-SB/logs/QC/init.log jobid: 216 wildcards: sample=MM13-SB priority: 80 threads: 16 resources: mem=32, java_mem=27

    reformat.sh in=/data/work/24samples/RRBS_Metagenomes/MM13_SB_R1.fastq.gz in2=/data/work/24samples/RRBS_Metagenomes/MM13_SB_R2.fastq.gz             interleaved=f             out1=MM13-SB/sequence_quality_control/MM13-SB_raw_R1.fastq.gz out2=MM13-SB/sequence_quality_control/MM13-SB_raw_R2.fastq.gz             iupacToN=t             touppercase=t             qout=33             overwrite=true             verifypaired=t             addslash=t             trimreaddescription=t             threads=16             -Xmx27G 2> MM13-SB/logs/QC/init.log

Activating conda environment: /databases/conda_envs/81c7173e [Tue Mar 5 05:38:26 2019] Finished job 216. 2 of 459 steps (0.44%) done

[Tue Mar 5 05:38:26 2019] rule initialize_qc: input: /data/work/24samples/RRBS_Metagenomes/MM15_SB_R1.fastq.gz, /data/work/24samples/RRBS_Metagenomes/MM15_SB_R2.fastq.gz output: MM15-SB/sequence_quality_control/MM15-SB_raw_R1.fastq.gz, MM15-SB/sequence_quality_control/MM15-SB_raw_R2.fastq.gz log: MM15-SB/logs/QC/init.log jobid: 231 wildcards: sample=MM15-SB priority: 80 threads: 16 resources: mem=32, java_mem=27

    reformat.sh in=/data/work/24samples/RRBS_Metagenomes/MM15_SB_R1.fastq.gz in2=/data/work/24samples/RRBS_Metagenomes/MM15_SB_R2.fastq.gz             interleaved=f             out1=MM15-SB/sequence_quality_control/MM15-SB_raw_R1.fastq.gz out2=MM15-SB/sequence_quality_control/MM15-SB_raw_R2.fastq.gz             iupacToN=t             touppercase=t             qout=33             overwrite=true             verifypaired=t             addslash=t             trimreaddescription=t             threads=16             -Xmx27G 2> MM15-SB/logs/QC/init.log

Activating conda environment: /databases/conda_envs/81c7173e [Tue Mar 5 05:40:17 2019] Finished job 231. 3 of 459 steps (0.65%) done

[Tue Mar 5 05:40:17 2019] rule initialize_qc: input: /data/work/24samples/RRBS_Metagenomes/MM24_B_R1.fastq.gz, /data/work/24samples/RRBS_Metagenomes/MM24_B_R2.fastq.gz output: MM24-B/sequence_quality_control/MM24-B_raw_R1.fastq.gz, MM24-B/sequence_quality_control/MM24-B_raw_R2.fastq.gz log: MM24-B/logs/QC/init.log jobid: 234 wildcards: sample=MM24-B priority: 80 threads: 16 resources: mem=32, java_mem=27

    reformat.sh in=/data/work/24samples/RRBS_Metagenomes/MM24_B_R1.fastq.gz in2=/data/work/24samples/RRBS_Metagenomes/MM24_B_R2.fastq.gz             interleaved=f             out1=MM24-B/sequence_quality_control/MM24-B_raw_R1.fastq.gz out2=MM24-B/sequence_quality_control/MM24-B_raw_R2.fastq.gz             iupacToN=t             touppercase=t             qout=33             overwrite=true             verifypaired=t             addslash=t             trimreaddescription=t             threads=16             -Xmx27G 2> MM24-B/logs/QC/init.log

Activating conda environment: /databases/conda_envs/81c7173e [Tue Mar 5 05:42:06 2019] Finished job 234. 4 of 459 steps (0.87%) done

[Tue Mar 5 05:42:06 2019] rule initialize_qc: input: /data/work/24samples/RRBS_Metagenomes/MM11_SB_R1.fastq.gz, /data/work/24samples/RRBS_Metagenomes/MM11_SB_R2.fastq.gz output: MM11-SB/sequence_quality_control/MM11-SB_raw_R1.fastq.gz, MM11-SB/sequence_quality_control/MM11-SB_raw_R2.fastq.gz log: MM11-SB/logs/QC/init.log jobid: 225 wildcards: sample=MM11-SB priority: 80 threads: 16 resources: mem=32, java_mem=27

    reformat.sh in=/data/work/24samples/RRBS_Metagenomes/MM11_SB_R1.fastq.gz in2=/data/work/24samples/RRBS_Metagenomes/MM11_SB_R2.fastq.gz             interleaved=f             out1=MM11-SB/sequence_quality_control/MM11-SB_raw_R1.fastq.gz out2=MM11-SB/sequence_quality_control/MM11-SB_raw_R2.fastq.gz             iupacToN=t             touppercase=t             qout=33             overwrite=true             verifypaired=t             addslash=t             trimreaddescription=t             threads=16             -Xmx27G 2> MM11-SB/logs/QC/init.log

Activating conda environment: /databases/conda_envs/81c7173e [Tue Mar 5 05:43:47 2019] Finished job 225. 5 of 459 steps (1%) done

[Tue Mar 5 05:43:47 2019] rule initialize_qc: input: /data/work/24samples/RRBS_Metagenomes/MM10_SB_R1.fastq.gz, /data/work/24samples/RRBS_Metagenomes/MM10_SB_R2.fastq.gz output: MM10-SB/sequence_quality_control/MM10-SB_raw_R1.fastq.gz, MM10-SB/sequence_quality_control/MM10-SB_raw_R2.fastq.gz log: MM10-SB/logs/QC/init.log jobid: 228 wildcards: sample=MM10-SB priority: 80 threads: 16 resources: mem=32, java_mem=27

    reformat.sh in=/data/work/24samples/RRBS_Metagenomes/MM10_SB_R1.fastq.gz in2=/data/work/24samples/RRBS_Metagenomes/MM10_SB_R2.fastq.gz             interleaved=f             out1=MM10-SB/sequence_quality_control/MM10-SB_raw_R1.fastq.gz out2=MM10-SB/sequence_quality_control/MM10-SB_raw_R2.fastq.gz             iupacToN=t             touppercase=t             qout=33             overwrite=true             verifypaired=t             addslash=t             trimreaddescription=t             threads=16             -Xmx27G 2> MM10-SB/logs/QC/init.log

Activating conda environment: /databases/conda_envs/81c7173e [Tue Mar 5 05:45:31 2019] Finished job 228. 6 of 459 steps (1%) done

[Tue Mar 5 05:45:31 2019] rule initialize_qc: input: /data/work/24samples/RRBS_Metagenomes/MM12_SB_R1.fastq.gz, /data/work/24samples/RRBS_Metagenomes/MM12_SB_R2.fastq.gz output: MM12-SB/sequence_quality_control/MM12-SB_raw_R1.fastq.gz, MM12-SB/sequence_quality_control/MM12-SB_raw_R2.fastq.gz log: MM12-SB/logs/QC/init.log jobid: 219 wildcards: sample=MM12-SB priority: 80 threads: 16 resources: mem=32, java_mem=27

    reformat.sh in=/data/work/24samples/RRBS_Metagenomes/MM12_SB_R1.fastq.gz in2=/data/work/24samples/RRBS_Metagenomes/MM12_SB_R2.fastq.gz             interleaved=f             out1=MM12-SB/sequence_quality_control/MM12-SB_raw_R1.fastq.gz out2=MM12-SB/sequence_quality_control/MM12-SB_raw_R2.fastq.gz             iupacToN=t             touppercase=t             qout=33             overwrite=true             verifypaired=t             addslash=t             trimreaddescription=t             threads=16             -Xmx27G 2> MM12-SB/logs/QC/init.log

Activating conda environment: /databases/conda_envs/81c7173e [Tue Mar 5 05:47:04 2019] Finished job 219. 7 of 459 steps (2%) done

[Tue Mar 5 05:47:04 2019] rule get_read_stats: input: MM14-SB/sequence_quality_control/MM14-SB_raw_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_raw_R2.fastq.gz output: MM14-SB/sequence_quality_control/read_stats/raw.zip, MM14-SB/sequence_quality_control/read_stats/raw_read_counts.tsv log: MM14-SB/logs/QC/read_stats/raw.log jobid: 60 wildcards: sample=MM14-SB, step=raw priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 05:49:20 2019] Finished job 60. 8 of 459 steps (2%) done

[Tue Mar 5 05:49:20 2019] rule get_read_stats: input: MM13-SB/sequence_quality_control/MM13-SB_raw_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_raw_R2.fastq.gz output: MM13-SB/sequence_quality_control/read_stats/raw.zip, MM13-SB/sequence_quality_control/read_stats/raw_read_counts.tsv log: MM13-SB/logs/QC/read_stats/raw.log jobid: 44 wildcards: sample=MM13-SB, step=raw priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 05:51:27 2019] Finished job 44. 9 of 459 steps (2%) done

[Tue Mar 5 05:51:27 2019] rule get_read_stats: input: MM15-SB/sequence_quality_control/MM15-SB_raw_R1.fastq.gz, MM15-SB/sequence_quality_control/MM15-SB_raw_R2.fastq.gz output: MM15-SB/sequence_quality_control/read_stats/raw.zip, MM15-SB/sequence_quality_control/read_stats/raw_read_counts.tsv log: MM15-SB/logs/QC/read_stats/raw.log jobid: 84 wildcards: sample=MM15-SB, step=raw priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 05:53:34 2019] Finished job 84. 10 of 459 steps (2%) done

[Tue Mar 5 05:53:34 2019] rule get_read_stats: input: MM24-B/sequence_quality_control/MM24-B_raw_R1.fastq.gz, MM24-B/sequence_quality_control/MM24-B_raw_R2.fastq.gz output: MM24-B/sequence_quality_control/read_stats/raw.zip, MM24-B/sequence_quality_control/read_stats/raw_read_counts.tsv log: MM24-B/logs/QC/read_stats/raw.log jobid: 92 wildcards: sample=MM24-B, step=raw priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 05:55:39 2019] Finished job 92. 11 of 459 steps (2%) done

[Tue Mar 5 05:55:39 2019] rule get_read_stats: input: MM11-SB/sequence_quality_control/MM11-SB_raw_R1.fastq.gz, MM11-SB/sequence_quality_control/MM11-SB_raw_R2.fastq.gz output: MM11-SB/sequence_quality_control/read_stats/raw.zip, MM11-SB/sequence_quality_control/read_stats/raw_read_counts.tsv log: MM11-SB/logs/QC/read_stats/raw.log jobid: 68 wildcards: sample=MM11-SB, step=raw priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 05:57:36 2019] Finished job 68. 12 of 459 steps (3%) done

[Tue Mar 5 05:57:36 2019] rule get_read_stats: input: MM10-SB/sequence_quality_control/MM10-SB_raw_R1.fastq.gz, MM10-SB/sequence_quality_control/MM10-SB_raw_R2.fastq.gz output: MM10-SB/sequence_quality_control/read_stats/raw.zip, MM10-SB/sequence_quality_control/read_stats/raw_read_counts.tsv log: MM10-SB/logs/QC/read_stats/raw.log jobid: 76 wildcards: sample=MM10-SB, step=raw priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 05:59:30 2019] Finished job 76. 13 of 459 steps (3%) done

[Tue Mar 5 05:59:30 2019] rule get_read_stats: input: MM12-SB/sequence_quality_control/MM12-SB_raw_R1.fastq.gz, MM12-SB/sequence_quality_control/MM12-SB_raw_R2.fastq.gz output: MM12-SB/sequence_quality_control/read_stats/raw.zip, MM12-SB/sequence_quality_control/read_stats/raw_read_counts.tsv log: MM12-SB/logs/QC/read_stats/raw.log jobid: 52 wildcards: sample=MM12-SB, step=raw priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:01:11 2019] Finished job 52. 14 of 459 steps (3%) done

[Tue Mar 5 06:01:11 2019] rule deduplicate_reads: input: MM14-SB/sequence_quality_control/MM14-SB_raw_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_raw_R2.fastq.gz output: MM14-SB/sequence_quality_control/MM14-SB_deduplicated_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_deduplicated_R2.fastq.gz log: MM14-SB/logs/QC/deduplicate.log jobid: 221 benchmark: logs/benchmarks/QC/deduplicate/MM14-SB.txt wildcards: sample=MM14-SB threads: 16 resources: mem=32, java_mem=27

        clumpify.sh                 in=MM14-SB/sequence_quality_control/MM14-SB_raw_R1.fastq.gz in2=MM14-SB/sequence_quality_control/MM14-SB_raw_R2.fastq.gz                 out1=MM14-SB/sequence_quality_control/MM14-SB_deduplicated_R1.fastq.gz out2=MM14-SB/sequence_quality_control/MM14-SB_deduplicated_R2.fastq.gz                 overwrite=true                dedupe=t                 dupesubs=2                 optical=f                threads=16                 -Xmx27G 2> MM14-SB/logs/QC/deduplicate.log

Activating conda environment: /databases/conda_envs/81c7173e Removing temporary output file MM14-SB/sequence_quality_control/MM14-SB_raw_R1.fastq.gz. Removing temporary output file MM14-SB/sequence_quality_control/MM14-SB_raw_R2.fastq.gz. [Tue Mar 5 06:03:46 2019] Finished job 221. 15 of 459 steps (3%) done

[Tue Mar 5 06:03:46 2019] rule get_read_stats: input: MM14-SB/sequence_quality_control/MM14-SB_deduplicated_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_deduplicated_R2.fastq.gz output: MM14-SB/sequence_quality_control/read_stats/deduplicated.zip, MM14-SB/sequence_quality_control/read_stats/deduplicated_read_counts.tsv log: MM14-SB/logs/QC/read_stats/deduplicated.log jobid: 61 wildcards: sample=MM14-SB, step=deduplicated priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:05:56 2019] Finished job 61. 16 of 459 steps (3%) done

[Tue Mar 5 06:05:56 2019] rule deduplicate_reads: input: MM13-SB/sequence_quality_control/MM13-SB_raw_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_raw_R2.fastq.gz output: MM13-SB/sequence_quality_control/MM13-SB_deduplicated_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_deduplicated_R2.fastq.gz log: MM13-SB/logs/QC/deduplicate.log jobid: 214 benchmark: logs/benchmarks/QC/deduplicate/MM13-SB.txt wildcards: sample=MM13-SB threads: 16 resources: mem=32, java_mem=27

        clumpify.sh                 in=MM13-SB/sequence_quality_control/MM13-SB_raw_R1.fastq.gz in2=MM13-SB/sequence_quality_control/MM13-SB_raw_R2.fastq.gz                 out1=MM13-SB/sequence_quality_control/MM13-SB_deduplicated_R1.fastq.gz out2=MM13-SB/sequence_quality_control/MM13-SB_deduplicated_R2.fastq.gz                 overwrite=true                dedupe=t                 dupesubs=2                 optical=f                threads=16                 -Xmx27G 2> MM13-SB/logs/QC/deduplicate.log

Activating conda environment: /databases/conda_envs/81c7173e
Removing temporary output file MM13-SB/sequence_quality_control/MM13-SB_raw_R1.fastq.gz.
Removing temporary output file MM13-SB/sequence_quality_control/MM13-SB_raw_R2.fastq.gz.
[Tue Mar 5 06:08:13 2019] Finished job 214. 17 of 459 steps (4%) done

[Tue Mar 5 06:08:13 2019] rule get_read_stats: input: MM13-SB/sequence_quality_control/MM13-SB_deduplicated_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_deduplicated_R2.fastq.gz output: MM13-SB/sequence_quality_control/read_stats/deduplicated.zip, MM13-SB/sequence_quality_control/read_stats/deduplicated_read_counts.tsv log: MM13-SB/logs/QC/read_stats/deduplicated.log jobid: 45 wildcards: sample=MM13-SB, step=deduplicated priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:10:14 2019] Finished job 45. 18 of 459 steps (4%) done

[Tue Mar 5 06:10:14 2019] rule deduplicate_reads: input: MM15-SB/sequence_quality_control/MM15-SB_raw_R1.fastq.gz, MM15-SB/sequence_quality_control/MM15-SB_raw_R2.fastq.gz output: MM15-SB/sequence_quality_control/MM15-SB_deduplicated_R1.fastq.gz, MM15-SB/sequence_quality_control/MM15-SB_deduplicated_R2.fastq.gz log: MM15-SB/logs/QC/deduplicate.log jobid: 230 benchmark: logs/benchmarks/QC/deduplicate/MM15-SB.txt wildcards: sample=MM15-SB threads: 16 resources: mem=32, java_mem=27

        clumpify.sh                 in=MM15-SB/sequence_quality_control/MM15-SB_raw_R1.fastq.gz in2=MM15-SB/sequence_quality_control/MM15-SB_raw_R2.fastq.gz                 out1=MM15-SB/sequence_quality_control/MM15-SB_deduplicated_R1.fastq.gz out2=MM15-SB/sequence_quality_control/MM15-SB_deduplicated_R2.fastq.gz                 overwrite=true                dedupe=t                 dupesubs=2                 optical=f                threads=16                 -Xmx27G 2> MM15-SB/logs/QC/deduplicate.log

Activating conda environment: /databases/conda_envs/81c7173e
Removing temporary output file MM15-SB/sequence_quality_control/MM15-SB_raw_R1.fastq.gz.
Removing temporary output file MM15-SB/sequence_quality_control/MM15-SB_raw_R2.fastq.gz.
[Tue Mar 5 06:12:21 2019] Finished job 230. 19 of 459 steps (4%) done

[Tue Mar 5 06:12:21 2019] rule get_read_stats: input: MM15-SB/sequence_quality_control/MM15-SB_deduplicated_R1.fastq.gz, MM15-SB/sequence_quality_control/MM15-SB_deduplicated_R2.fastq.gz output: MM15-SB/sequence_quality_control/read_stats/deduplicated.zip, MM15-SB/sequence_quality_control/read_stats/deduplicated_read_counts.tsv log: MM15-SB/logs/QC/read_stats/deduplicated.log jobid: 85 wildcards: sample=MM15-SB, step=deduplicated priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:14:22 2019] Finished job 85. 20 of 459 steps (4%) done

[Tue Mar 5 06:14:22 2019] rule deduplicate_reads: input: MM24-B/sequence_quality_control/MM24-B_raw_R1.fastq.gz, MM24-B/sequence_quality_control/MM24-B_raw_R2.fastq.gz output: MM24-B/sequence_quality_control/MM24-B_deduplicated_R1.fastq.gz, MM24-B/sequence_quality_control/MM24-B_deduplicated_R2.fastq.gz log: MM24-B/logs/QC/deduplicate.log jobid: 233 benchmark: logs/benchmarks/QC/deduplicate/MM24-B.txt wildcards: sample=MM24-B threads: 16 resources: mem=32, java_mem=27

        clumpify.sh                 in=MM24-B/sequence_quality_control/MM24-B_raw_R1.fastq.gz in2=MM24-B/sequence_quality_control/MM24-B_raw_R2.fastq.gz                 out1=MM24-B/sequence_quality_control/MM24-B_deduplicated_R1.fastq.gz out2=MM24-B/sequence_quality_control/MM24-B_deduplicated_R2.fastq.gz                 overwrite=true                dedupe=t                 dupesubs=2                 optical=f                threads=16                 -Xmx27G 2> MM24-B/logs/QC/deduplicate.log

Activating conda environment: /databases/conda_envs/81c7173e
Removing temporary output file MM24-B/sequence_quality_control/MM24-B_raw_R2.fastq.gz.
Removing temporary output file MM24-B/sequence_quality_control/MM24-B_raw_R1.fastq.gz.
[Tue Mar 5 06:15:57 2019] Finished job 233. 21 of 459 steps (5%) done

[Tue Mar 5 06:15:57 2019] rule get_read_stats: input: MM24-B/sequence_quality_control/MM24-B_deduplicated_R1.fastq.gz, MM24-B/sequence_quality_control/MM24-B_deduplicated_R2.fastq.gz output: MM24-B/sequence_quality_control/read_stats/deduplicated.zip, MM24-B/sequence_quality_control/read_stats/deduplicated_read_counts.tsv log: MM24-B/logs/QC/read_stats/deduplicated.log jobid: 93 wildcards: sample=MM24-B, step=deduplicated priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:16:58 2019] Finished job 93. 22 of 459 steps (5%) done

[Tue Mar 5 06:16:58 2019] rule deduplicate_reads: input: MM11-SB/sequence_quality_control/MM11-SB_raw_R1.fastq.gz, MM11-SB/sequence_quality_control/MM11-SB_raw_R2.fastq.gz output: MM11-SB/sequence_quality_control/MM11-SB_deduplicated_R1.fastq.gz, MM11-SB/sequence_quality_control/MM11-SB_deduplicated_R2.fastq.gz log: MM11-SB/logs/QC/deduplicate.log jobid: 224 benchmark: logs/benchmarks/QC/deduplicate/MM11-SB.txt wildcards: sample=MM11-SB threads: 16 resources: mem=32, java_mem=27

        clumpify.sh                 in=MM11-SB/sequence_quality_control/MM11-SB_raw_R1.fastq.gz in2=MM11-SB/sequence_quality_control/MM11-SB_raw_R2.fastq.gz                 out1=MM11-SB/sequence_quality_control/MM11-SB_deduplicated_R1.fastq.gz out2=MM11-SB/sequence_quality_control/MM11-SB_deduplicated_R2.fastq.gz                 overwrite=true                dedupe=t                 dupesubs=2                 optical=f                threads=16                 -Xmx27G 2> MM11-SB/logs/QC/deduplicate.log

Activating conda environment: /databases/conda_envs/81c7173e
Removing temporary output file MM11-SB/sequence_quality_control/MM11-SB_raw_R2.fastq.gz.
Removing temporary output file MM11-SB/sequence_quality_control/MM11-SB_raw_R1.fastq.gz.
[Tue Mar 5 06:18:50 2019] Finished job 224. 23 of 459 steps (5%) done

[Tue Mar 5 06:18:50 2019] rule get_read_stats: input: MM11-SB/sequence_quality_control/MM11-SB_deduplicated_R1.fastq.gz, MM11-SB/sequence_quality_control/MM11-SB_deduplicated_R2.fastq.gz output: MM11-SB/sequence_quality_control/read_stats/deduplicated.zip, MM11-SB/sequence_quality_control/read_stats/deduplicated_read_counts.tsv log: MM11-SB/logs/QC/read_stats/deduplicated.log jobid: 69 wildcards: sample=MM11-SB, step=deduplicated priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:20:36 2019] Finished job 69. 24 of 459 steps (5%) done

[Tue Mar 5 06:20:36 2019] rule deduplicate_reads: input: MM10-SB/sequence_quality_control/MM10-SB_raw_R1.fastq.gz, MM10-SB/sequence_quality_control/MM10-SB_raw_R2.fastq.gz output: MM10-SB/sequence_quality_control/MM10-SB_deduplicated_R1.fastq.gz, MM10-SB/sequence_quality_control/MM10-SB_deduplicated_R2.fastq.gz log: MM10-SB/logs/QC/deduplicate.log jobid: 227 benchmark: logs/benchmarks/QC/deduplicate/MM10-SB.txt wildcards: sample=MM10-SB threads: 16 resources: mem=32, java_mem=27

        clumpify.sh                 in=MM10-SB/sequence_quality_control/MM10-SB_raw_R1.fastq.gz in2=MM10-SB/sequence_quality_control/MM10-SB_raw_R2.fastq.gz                 out1=MM10-SB/sequence_quality_control/MM10-SB_deduplicated_R1.fastq.gz out2=MM10-SB/sequence_quality_control/MM10-SB_deduplicated_R2.fastq.gz                 overwrite=true                dedupe=t                 dupesubs=2                 optical=f                threads=16                 -Xmx27G 2> MM10-SB/logs/QC/deduplicate.log

Activating conda environment: /databases/conda_envs/81c7173e
Removing temporary output file MM10-SB/sequence_quality_control/MM10-SB_raw_R2.fastq.gz.
Removing temporary output file MM10-SB/sequence_quality_control/MM10-SB_raw_R1.fastq.gz.
[Tue Mar 5 06:22:33 2019] Finished job 227. 25 of 459 steps (5%) done

[Tue Mar 5 06:22:33 2019] rule get_read_stats: input: MM10-SB/sequence_quality_control/MM10-SB_deduplicated_R1.fastq.gz, MM10-SB/sequence_quality_control/MM10-SB_deduplicated_R2.fastq.gz output: MM10-SB/sequence_quality_control/read_stats/deduplicated.zip, MM10-SB/sequence_quality_control/read_stats/deduplicated_read_counts.tsv log: MM10-SB/logs/QC/read_stats/deduplicated.log jobid: 77 wildcards: sample=MM10-SB, step=deduplicated priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:24:23 2019] Finished job 77. 26 of 459 steps (6%) done

[Tue Mar 5 06:24:23 2019] rule deduplicate_reads: input: MM12-SB/sequence_quality_control/MM12-SB_raw_R1.fastq.gz, MM12-SB/sequence_quality_control/MM12-SB_raw_R2.fastq.gz output: MM12-SB/sequence_quality_control/MM12-SB_deduplicated_R1.fastq.gz, MM12-SB/sequence_quality_control/MM12-SB_deduplicated_R2.fastq.gz log: MM12-SB/logs/QC/deduplicate.log jobid: 218 benchmark: logs/benchmarks/QC/deduplicate/MM12-SB.txt wildcards: sample=MM12-SB threads: 16 resources: mem=32, java_mem=27

        clumpify.sh                 in=MM12-SB/sequence_quality_control/MM12-SB_raw_R1.fastq.gz in2=MM12-SB/sequence_quality_control/MM12-SB_raw_R2.fastq.gz                 out1=MM12-SB/sequence_quality_control/MM12-SB_deduplicated_R1.fastq.gz out2=MM12-SB/sequence_quality_control/MM12-SB_deduplicated_R2.fastq.gz                 overwrite=true                dedupe=t                 dupesubs=2                 optical=f                threads=16                 -Xmx27G 2> MM12-SB/logs/QC/deduplicate.log

Activating conda environment: /databases/conda_envs/81c7173e
Removing temporary output file MM12-SB/sequence_quality_control/MM12-SB_raw_R2.fastq.gz.
Removing temporary output file MM12-SB/sequence_quality_control/MM12-SB_raw_R1.fastq.gz.
[Tue Mar 5 06:26:06 2019] Finished job 218. 27 of 459 steps (6%) done

[Tue Mar 5 06:26:06 2019] rule get_read_stats: input: MM12-SB/sequence_quality_control/MM12-SB_deduplicated_R1.fastq.gz, MM12-SB/sequence_quality_control/MM12-SB_deduplicated_R2.fastq.gz output: MM12-SB/sequence_quality_control/read_stats/deduplicated.zip, MM12-SB/sequence_quality_control/read_stats/deduplicated_read_counts.tsv log: MM12-SB/logs/QC/read_stats/deduplicated.log jobid: 53 wildcards: sample=MM12-SB, step=deduplicated priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:27:43 2019] Finished job 53. 28 of 459 steps (6%) done

[Tue Mar 5 06:27:43 2019] rule apply_quality_filter: input: MM14-SB/sequence_quality_control/MM14-SB_deduplicated_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_deduplicated_R2.fastq.gz, /databases/adapters.fa output: MM14-SB/sequence_quality_control/MM14-SB_filtered_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_filtered_R2.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_filtered_se.fastq.gz, MM14-SB/logs/MM14-SB_quality_filtering_stats.txt log: MM14-SB/logs/QC/quality_filter.log jobid: 59 benchmark: logs/benchmarks/QC/quality_filter/MM14-SB.txt wildcards: sample=MM14-SB threads: 16 resources: mem=32, java_mem=27

    bbduk.sh in=MM14-SB/sequence_quality_control/MM14-SB_deduplicated_R1.fastq.gz in2=MM14-SB/sequence_quality_control/MM14-SB_deduplicated_R2.fastq.gz             ref=/databases/adapters.fa             interleaved=f             out=MM14-SB/sequence_quality_control/MM14-SB_filtered_R1.fastq.gz out2=MM14-SB/sequence_quality_control/MM14-SB_filtered_R2.fastq.gz outs=MM14-SB/sequence_quality_control/MM14-SB_filtered_se.fastq.gz             stats=MM14-SB/logs/MM14-SB_quality_filtering_stats.txt             overwrite=true             qout=33             trd=t             hdist=1             k=27             ktrim=r             mink=8             trimq=10             qtrim=rl             threads=16             minlength=51             maxns=-1             minbasefrequency=0.05             ecco=t             prealloc=t             -Xmx27G 2> MM14-SB/logs/QC/quality_filter.log

Activating conda environment: /databases/conda_envs/81c7173e
Removing temporary output file MM14-SB/sequence_quality_control/MM14-SB_deduplicated_R1.fastq.gz.
Removing temporary output file MM14-SB/sequence_quality_control/MM14-SB_deduplicated_R2.fastq.gz.
[Tue Mar 5 06:30:19 2019] Finished job 59. 29 of 459 steps (6%) done

[Tue Mar 5 06:30:19 2019] rule get_read_stats: input: MM14-SB/sequence_quality_control/MM14-SB_filtered_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_filtered_R2.fastq.gz output: MM14-SB/sequence_quality_control/read_stats/filtered.zip, MM14-SB/sequence_quality_control/read_stats/filtered_read_counts.tsv log: MM14-SB/logs/QC/read_stats/filtered.log jobid: 62 wildcards: sample=MM14-SB, step=filtered priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:32:24 2019] Finished job 62. 30 of 459 steps (7%) done

[Tue Mar 5 06:32:24 2019] rule apply_quality_filter: input: MM13-SB/sequence_quality_control/MM13-SB_deduplicated_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_deduplicated_R2.fastq.gz, /databases/adapters.fa output: MM13-SB/sequence_quality_control/MM13-SB_filtered_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_filtered_R2.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_filtered_se.fastq.gz, MM13-SB/logs/MM13-SB_quality_filtering_stats.txt log: MM13-SB/logs/QC/quality_filter.log jobid: 43 benchmark: logs/benchmarks/QC/quality_filter/MM13-SB.txt wildcards: sample=MM13-SB threads: 16 resources: mem=32, java_mem=27

    bbduk.sh in=MM13-SB/sequence_quality_control/MM13-SB_deduplicated_R1.fastq.gz in2=MM13-SB/sequence_quality_control/MM13-SB_deduplicated_R2.fastq.gz             ref=/databases/adapters.fa             interleaved=f             out=MM13-SB/sequence_quality_control/MM13-SB_filtered_R1.fastq.gz out2=MM13-SB/sequence_quality_control/MM13-SB_filtered_R2.fastq.gz outs=MM13-SB/sequence_quality_control/MM13-SB_filtered_se.fastq.gz             stats=MM13-SB/logs/MM13-SB_quality_filtering_stats.txt             overwrite=true             qout=33             trd=t             hdist=1             k=27             ktrim=r             mink=8             trimq=10             qtrim=rl             threads=16             minlength=51             maxns=-1             minbasefrequency=0.05             ecco=t             prealloc=t             -Xmx27G 2> MM13-SB/logs/QC/quality_filter.log

Activating conda environment: /databases/conda_envs/81c7173e
Removing temporary output file MM13-SB/sequence_quality_control/MM13-SB_deduplicated_R1.fastq.gz.
Removing temporary output file MM13-SB/sequence_quality_control/MM13-SB_deduplicated_R2.fastq.gz.
[Tue Mar 5 06:34:47 2019] Finished job 43. 31 of 459 steps (7%) done

[Tue Mar 5 06:34:47 2019] rule get_read_stats: input: MM13-SB/sequence_quality_control/MM13-SB_filtered_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_filtered_R2.fastq.gz output: MM13-SB/sequence_quality_control/read_stats/filtered.zip, MM13-SB/sequence_quality_control/read_stats/filtered_read_counts.tsv log: MM13-SB/logs/QC/read_stats/filtered.log jobid: 46 wildcards: sample=MM13-SB, step=filtered priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:36:48 2019] Finished job 46. 32 of 459 steps (7%) done

[Tue Mar 5 06:36:48 2019] rule apply_quality_filter: input: MM15-SB/sequence_quality_control/MM15-SB_deduplicated_R1.fastq.gz, MM15-SB/sequence_quality_control/MM15-SB_deduplicated_R2.fastq.gz, /databases/adapters.fa output: MM15-SB/sequence_quality_control/MM15-SB_filtered_R1.fastq.gz, MM15-SB/sequence_quality_control/MM15-SB_filtered_R2.fastq.gz, MM15-SB/sequence_quality_control/MM15-SB_filtered_se.fastq.gz, MM15-SB/logs/MM15-SB_quality_filtering_stats.txt log: MM15-SB/logs/QC/quality_filter.log jobid: 83 benchmark: logs/benchmarks/QC/quality_filter/MM15-SB.txt wildcards: sample=MM15-SB threads: 16 resources: mem=32, java_mem=27

    bbduk.sh in=MM15-SB/sequence_quality_control/MM15-SB_deduplicated_R1.fastq.gz in2=MM15-SB/sequence_quality_control/MM15-SB_deduplicated_R2.fastq.gz             ref=/databases/adapters.fa             interleaved=f             out=MM15-SB/sequence_quality_control/MM15-SB_filtered_R1.fastq.gz out2=MM15-SB/sequence_quality_control/MM15-SB_filtered_R2.fastq.gz outs=MM15-SB/sequence_quality_control/MM15-SB_filtered_se.fastq.gz             stats=MM15-SB/logs/MM15-SB_quality_filtering_stats.txt             overwrite=true             qout=33             trd=t             hdist=1             k=27             ktrim=r             mink=8             trimq=10             qtrim=rl             threads=16             minlength=51             maxns=-1             minbasefrequency=0.05             ecco=t             prealloc=t             -Xmx27G 2> MM15-SB/logs/QC/quality_filter.log

Activating conda environment: /databases/conda_envs/81c7173e
Removing temporary output file MM15-SB/sequence_quality_control/MM15-SB_deduplicated_R2.fastq.gz.
Removing temporary output file MM15-SB/sequence_quality_control/MM15-SB_deduplicated_R1.fastq.gz.
[Tue Mar 5 06:39:05 2019] Finished job 83. 33 of 459 steps (7%) done

[Tue Mar 5 06:39:05 2019] rule get_read_stats: input: MM15-SB/sequence_quality_control/MM15-SB_filtered_R1.fastq.gz, MM15-SB/sequence_quality_control/MM15-SB_filtered_R2.fastq.gz output: MM15-SB/sequence_quality_control/read_stats/filtered.zip, MM15-SB/sequence_quality_control/read_stats/filtered_read_counts.tsv log: MM15-SB/logs/QC/read_stats/filtered.log jobid: 86 wildcards: sample=MM15-SB, step=filtered priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:41:00 2019] Finished job 86. 34 of 459 steps (7%) done

[Tue Mar 5 06:41:00 2019] rule apply_quality_filter: input: MM10-SB/sequence_quality_control/MM10-SB_deduplicated_R1.fastq.gz, MM10-SB/sequence_quality_control/MM10-SB_deduplicated_R2.fastq.gz, /databases/adapters.fa output: MM10-SB/sequence_quality_control/MM10-SB_filtered_R1.fastq.gz, MM10-SB/sequence_quality_control/MM10-SB_filtered_R2.fastq.gz, MM10-SB/sequence_quality_control/MM10-SB_filtered_se.fastq.gz, MM10-SB/logs/MM10-SB_quality_filtering_stats.txt log: MM10-SB/logs/QC/quality_filter.log jobid: 75 benchmark: logs/benchmarks/QC/quality_filter/MM10-SB.txt wildcards: sample=MM10-SB threads: 16 resources: mem=32, java_mem=27

    bbduk.sh in=MM10-SB/sequence_quality_control/MM10-SB_deduplicated_R1.fastq.gz in2=MM10-SB/sequence_quality_control/MM10-SB_deduplicated_R2.fastq.gz             ref=/databases/adapters.fa             interleaved=f             out=MM10-SB/sequence_quality_control/MM10-SB_filtered_R1.fastq.gz out2=MM10-SB/sequence_quality_control/MM10-SB_filtered_R2.fastq.gz outs=MM10-SB/sequence_quality_control/MM10-SB_filtered_se.fastq.gz             stats=MM10-SB/logs/MM10-SB_quality_filtering_stats.txt             overwrite=true             qout=33             trd=t             hdist=1             k=27             ktrim=r             mink=8             trimq=10             qtrim=rl             threads=16             minlength=51             maxns=-1             minbasefrequency=0.05             ecco=t             prealloc=t             -Xmx27G 2> MM10-SB/logs/QC/quality_filter.log

Activating conda environment: /databases/conda_envs/81c7173e
Removing temporary output file MM10-SB/sequence_quality_control/MM10-SB_deduplicated_R2.fastq.gz.
Removing temporary output file MM10-SB/sequence_quality_control/MM10-SB_deduplicated_R1.fastq.gz.
[Tue Mar 5 06:43:22 2019] Finished job 75. 35 of 459 steps (8%) done

[Tue Mar 5 06:43:22 2019] rule get_read_stats: input: MM10-SB/sequence_quality_control/MM10-SB_filtered_R1.fastq.gz, MM10-SB/sequence_quality_control/MM10-SB_filtered_R2.fastq.gz output: MM10-SB/sequence_quality_control/read_stats/filtered.zip, MM10-SB/sequence_quality_control/read_stats/filtered_read_counts.tsv log: MM10-SB/logs/QC/read_stats/filtered.log jobid: 78 wildcards: sample=MM10-SB, step=filtered priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:45:11 2019] Finished job 78. 36 of 459 steps (8%) done

[Tue Mar 5 06:45:11 2019] rule apply_quality_filter: input: MM12-SB/sequence_quality_control/MM12-SB_deduplicated_R1.fastq.gz, MM12-SB/sequence_quality_control/MM12-SB_deduplicated_R2.fastq.gz, /databases/adapters.fa output: MM12-SB/sequence_quality_control/MM12-SB_filtered_R1.fastq.gz, MM12-SB/sequence_quality_control/MM12-SB_filtered_R2.fastq.gz, MM12-SB/sequence_quality_control/MM12-SB_filtered_se.fastq.gz, MM12-SB/logs/MM12-SB_quality_filtering_stats.txt log: MM12-SB/logs/QC/quality_filter.log jobid: 51 benchmark: logs/benchmarks/QC/quality_filter/MM12-SB.txt wildcards: sample=MM12-SB threads: 16 resources: mem=32, java_mem=27

    bbduk.sh in=MM12-SB/sequence_quality_control/MM12-SB_deduplicated_R1.fastq.gz in2=MM12-SB/sequence_quality_control/MM12-SB_deduplicated_R2.fastq.gz             ref=/databases/adapters.fa             interleaved=f             out=MM12-SB/sequence_quality_control/MM12-SB_filtered_R1.fastq.gz out2=MM12-SB/sequence_quality_control/MM12-SB_filtered_R2.fastq.gz outs=MM12-SB/sequence_quality_control/MM12-SB_filtered_se.fastq.gz             stats=MM12-SB/logs/MM12-SB_quality_filtering_stats.txt             overwrite=true             qout=33             trd=t             hdist=1             k=27             ktrim=r             mink=8             trimq=10             qtrim=rl             threads=16             minlength=51             maxns=-1             minbasefrequency=0.05             ecco=t             prealloc=t             -Xmx27G 2> MM12-SB/logs/QC/quality_filter.log

Activating conda environment: /databases/conda_envs/81c7173e
Removing temporary output file MM12-SB/sequence_quality_control/MM12-SB_deduplicated_R2.fastq.gz.
Removing temporary output file MM12-SB/sequence_quality_control/MM12-SB_deduplicated_R1.fastq.gz.
[Tue Mar 5 06:47:11 2019] Finished job 51. 37 of 459 steps (8%) done

[Tue Mar 5 06:47:11 2019] rule get_read_stats: input: MM12-SB/sequence_quality_control/MM12-SB_filtered_R1.fastq.gz, MM12-SB/sequence_quality_control/MM12-SB_filtered_R2.fastq.gz output: MM12-SB/sequence_quality_control/read_stats/filtered.zip, MM12-SB/sequence_quality_control/read_stats/filtered_read_counts.tsv log: MM12-SB/logs/QC/read_stats/filtered.log jobid: 54 wildcards: sample=MM12-SB, step=filtered priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:48:37 2019] Finished job 54. 38 of 459 steps (8%) done

[Tue Mar 5 06:48:37 2019] rule apply_quality_filter: input: MM11-SB/sequence_quality_control/MM11-SB_deduplicated_R1.fastq.gz, MM11-SB/sequence_quality_control/MM11-SB_deduplicated_R2.fastq.gz, /databases/adapters.fa output: MM11-SB/sequence_quality_control/MM11-SB_filtered_R1.fastq.gz, MM11-SB/sequence_quality_control/MM11-SB_filtered_R2.fastq.gz, MM11-SB/sequence_quality_control/MM11-SB_filtered_se.fastq.gz, MM11-SB/logs/MM11-SB_quality_filtering_stats.txt log: MM11-SB/logs/QC/quality_filter.log jobid: 67 benchmark: logs/benchmarks/QC/quality_filter/MM11-SB.txt wildcards: sample=MM11-SB threads: 16 resources: mem=32, java_mem=27

    bbduk.sh in=MM11-SB/sequence_quality_control/MM11-SB_deduplicated_R1.fastq.gz in2=MM11-SB/sequence_quality_control/MM11-SB_deduplicated_R2.fastq.gz             ref=/databases/adapters.fa             interleaved=f             out=MM11-SB/sequence_quality_control/MM11-SB_filtered_R1.fastq.gz out2=MM11-SB/sequence_quality_control/MM11-SB_filtered_R2.fastq.gz outs=MM11-SB/sequence_quality_control/MM11-SB_filtered_se.fastq.gz             stats=MM11-SB/logs/MM11-SB_quality_filtering_stats.txt             overwrite=true             qout=33             trd=t             hdist=1             k=27             ktrim=r             mink=8             trimq=10             qtrim=rl             threads=16             minlength=51             maxns=-1             minbasefrequency=0.05             ecco=t             prealloc=t             -Xmx27G 2> MM11-SB/logs/QC/quality_filter.log

Activating conda environment: /databases/conda_envs/81c7173e
Removing temporary output file MM11-SB/sequence_quality_control/MM11-SB_deduplicated_R1.fastq.gz.
Removing temporary output file MM11-SB/sequence_quality_control/MM11-SB_deduplicated_R2.fastq.gz.
[Tue Mar 5 06:50:44 2019] Finished job 67. 39 of 459 steps (8%) done

[Tue Mar 5 06:50:44 2019] rule get_read_stats: input: MM11-SB/sequence_quality_control/MM11-SB_filtered_R1.fastq.gz, MM11-SB/sequence_quality_control/MM11-SB_filtered_R2.fastq.gz output: MM11-SB/sequence_quality_control/read_stats/filtered.zip, MM11-SB/sequence_quality_control/read_stats/filtered_read_counts.tsv log: MM11-SB/logs/QC/read_stats/filtered.log jobid: 70 wildcards: sample=MM11-SB, step=filtered priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:52:33 2019] Finished job 70. 40 of 459 steps (9%) done

[Tue Mar 5 06:52:33 2019] rule apply_quality_filter: input: MM24-B/sequence_quality_control/MM24-B_deduplicated_R1.fastq.gz, MM24-B/sequence_quality_control/MM24-B_deduplicated_R2.fastq.gz, /databases/adapters.fa output: MM24-B/sequence_quality_control/MM24-B_filtered_R1.fastq.gz, MM24-B/sequence_quality_control/MM24-B_filtered_R2.fastq.gz, MM24-B/sequence_quality_control/MM24-B_filtered_se.fastq.gz, MM24-B/logs/MM24-B_quality_filtering_stats.txt log: MM24-B/logs/QC/quality_filter.log jobid: 91 benchmark: logs/benchmarks/QC/quality_filter/MM24-B.txt wildcards: sample=MM24-B threads: 16 resources: mem=32, java_mem=27

    bbduk.sh in=MM24-B/sequence_quality_control/MM24-B_deduplicated_R1.fastq.gz in2=MM24-B/sequence_quality_control/MM24-B_deduplicated_R2.fastq.gz             ref=/databases/adapters.fa             interleaved=f             out=MM24-B/sequence_quality_control/MM24-B_filtered_R1.fastq.gz out2=MM24-B/sequence_quality_control/MM24-B_filtered_R2.fastq.gz outs=MM24-B/sequence_quality_control/MM24-B_filtered_se.fastq.gz             stats=MM24-B/logs/MM24-B_quality_filtering_stats.txt             overwrite=true             qout=33             trd=t             hdist=1             k=27             ktrim=r             mink=8             trimq=10             qtrim=rl             threads=16             minlength=51             maxns=-1             minbasefrequency=0.05             ecco=t             prealloc=t             -Xmx27G 2> MM24-B/logs/QC/quality_filter.log

Activating conda environment: /databases/conda_envs/81c7173e
Removing temporary output file MM24-B/sequence_quality_control/MM24-B_deduplicated_R2.fastq.gz.
Removing temporary output file MM24-B/sequence_quality_control/MM24-B_deduplicated_R1.fastq.gz.
[Tue Mar 5 06:54:03 2019] Finished job 91. 41 of 459 steps (9%) done

[Tue Mar 5 06:54:03 2019] rule get_read_stats: input: MM24-B/sequence_quality_control/MM24-B_filtered_R1.fastq.gz, MM24-B/sequence_quality_control/MM24-B_filtered_R2.fastq.gz output: MM24-B/sequence_quality_control/read_stats/filtered.zip, MM24-B/sequence_quality_control/read_stats/filtered_read_counts.tsv log: MM24-B/logs/QC/read_stats/filtered.log jobid: 94 wildcards: sample=MM24-B, step=filtered priority: 30 threads: 16 resources: mem=32, java_mem=27

[Tue Mar 5 06:55:03 2019] Finished job 94. 42 of 459 steps (9%) done

[Tue Mar 5 06:55:03 2019] rule initialize_checkm: input: /databases/checkm/taxon_marker_sets.tsv, /databases/checkm/selected_marker_sets.tsv, /databases/checkm/pfam/tigrfam2pfam.tsv, /databases/checkm/pfam/Pfam-A.hmm.dat, /databases/checkm/img/img_metadata.tsv, /databases/checkm/hmms_ssu/SSU_euk.hmm, /databases/checkm/hmms_ssu/SSU_bacteria.hmm, /databases/checkm/hmms_ssu/SSU_archaea.hmm, /databases/checkm/hmms_ssu/createHMMs.py, /databases/checkm/hmms/phylo.hmm.ssi, /databases/checkm/hmms/phylo.hmm, /databases/checkm/hmms/checkm.hmm.ssi, /databases/checkm/hmms/checkm.hmm, /databases/checkm/genome_tree/missing_duplicate_genes_97.tsv, /databases/checkm/genome_tree/missing_duplicate_genes_50.tsv, /databases/checkm/genome_tree/genome_tree.taxonomy.tsv, /databases/checkm/genome_tree/genome_tree_reduced.refpkg/phylomodelJqWx6.json, /databases/checkm/genome_tree/genome_tree_reduced.refpkg/genome_tree.tre, /databases/checkm/genome_tree/genome_tree_reduced.refpkg/genome_tree.log, /databases/checkm/genome_tree/genome_tree_reduced.refpkg/genome_tree.fasta, /databases/checkm/genome_tree/genome_tree_reduced.refpkg/CONTENTS.json, /databases/checkm/genome_tree/genome_tree.metadata.tsv, /databases/checkm/genome_tree/genome_tree_full.refpkg/phylo_modelEcOyPk.json, /databases/checkm/genome_tree/genome_tree_full.refpkg/genome_tree.tre, /databases/checkm/genome_tree/genome_tree_full.refpkg/genome_tree.log, /databases/checkm/genome_tree/genome_tree_full.refpkg/genome_tree.fasta, /databases/checkm/genome_tree/genome_tree_full.refpkg/CONTENTS.json, /databases/checkm/genome_tree/genome_tree.derep.txt, /databases/checkm/.dmanifest, /databases/checkm/distributions/td_dist.txt, /databases/checkm/distributions/gc_dist.txt, /databases/checkm/distributions/cd_dist.txt output: logs/checkm_init.txt log: logs/initialize_checkm.log jobid: 284

    python /opt/conda/lib/python3.6/site-packages/atlas/rules/initialize_checkm.py /databases/checkm logs/checkm_init.txt logs/initialize_checkm.log

Activating conda environment: /databases/conda_envs/420ace10
[Tue Mar 5 06:55:23 2019] Finished job 284. 43 of 459 steps (9%) done

[Tue Mar 5 06:55:23 2019] rule build_decontamination_db: input: /databases/phiX174_virus.fa output: ref/genome/1/summary.txt log: logs/QC/build_decontamination_db.log jobid: 309 threads: 16 resources: mem=32, java_mem=27

        bbsplit.sh -Xmx27G ref_PhiX=/databases/phiX174_virus.fa threads=16 k=13 local=t 2> logs/QC/build_decontamination_db.log

Activating conda environment: /databases/conda_envs/81c7173e
[Tue Mar 5 06:55:27 2019] Finished job 309. 44 of 459 steps (10%) done

[Tue Mar 5 06:55:27 2019] rule run_decontamination: input: MM14-SB/sequence_quality_control/MM14-SB_filtered_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_filtered_R2.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_filtered_se.fastq.gz, ref/genome/1/summary.txt output: MM14-SB/sequence_quality_control/MM14-SB_clean_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_clean_R2.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_clean_se.fastq.gz, MM14-SB/sequence_quality_control/contaminants/PhiX_R1.fastq.gz, MM14-SB/sequence_quality_control/contaminants/PhiX_R2.fastq.gz, MM14-SB/sequence_quality_control/contaminants/PhiX_se.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_decontamination_reference_stats.txt log: MM14-SB/logs/QC/decontamination.log jobid: 220 benchmark: logs/benchmarks/QC/decontamination/MM14-SB.txt wildcards: sample=MM14-SB threads: 16 resources: mem=32, java_mem=27

        if [ "true" = true ] ; then
            bbsplit.sh in1=MM14-SB/sequence_quality_control/MM14-SB_filtered_R1.fastq.gz in2=MM14-SB/sequence_quality_control/MM14-SB_filtered_R2.fastq.gz                     outu1=MM14-SB/sequence_quality_control/MM14-SB_clean_R1.fastq.gz outu2=MM14-SB/sequence_quality_control/MM14-SB_clean_R2.fastq.gz                     basename="MM14-SB/sequence_quality_control/contaminants/%_R#.fastq.gz"                     maxindel=20 minratio=0.65                     minhits=1 ambiguous=best refstats=MM14-SB/sequence_quality_control/MM14-SB_decontamination_reference_stats.txt                    threads=16 k=13 local=t                     -Xmx27G 2> MM14-SB/logs/QC/decontamination.log
        fi

        bbsplit.sh in=MM14-SB/sequence_quality_control/MM14-SB_filtered_se.fastq.gz                  outu=MM14-SB/sequence_quality_control/MM14-SB_clean_se.fastq.gz                 basename="MM14-SB/sequence_quality_control/contaminants/%_se.fastq.gz"                 maxindel=20 minratio=0.65                 minhits=1 ambiguous=best refstats=MM14-SB/sequence_quality_control/MM14-SB_decontamination_reference_stats.txt append                 interleaved=f threads=16 k=13 local=t                 -Xmx27G 2>> MM14-SB/logs/QC/decontamination.log

Activating conda environment: /databases/conda_envs/81c7173e
Removing temporary output file MM14-SB/sequence_quality_control/MM14-SB_filtered_R1.fastq.gz.
Removing temporary output file MM14-SB/sequence_quality_control/MM14-SB_filtered_R2.fastq.gz.
Removing temporary output file MM14-SB/sequence_quality_control/MM14-SB_filtered_se.fastq.gz.
[Tue Mar 5 06:57:03 2019]
Finished job 220.
45 of 459 steps (10%) done

[Tue Mar 5 06:57:03 2019]
rule get_read_stats:
    input: MM14-SB/sequence_quality_control/MM14-SB_clean_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_clean_R2.fastq.gz
    output: MM14-SB/sequence_quality_control/read_stats/clean.zip, MM14-SB/sequence_quality_control/read_stats/clean_read_counts.tsv
    log: MM14-SB/logs/QC/read_stats/clean.log
    jobid: 63
    wildcards: sample=MM14-SB, step=clean
    priority: 30
    threads: 16
    resources: mem=32, java_mem=27

[Tue Mar 5 06:59:09 2019]
Finished job 63.
46 of 459 steps (10%) done

[Tue Mar 5 06:59:09 2019]
localrule qcreads:
    input: MM14-SB/sequence_quality_control/MM14-SB_clean_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_clean_R2.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_clean_se.fastq.gz
    output: MM14-SB/sequence_quality_control/MM14-SB_QC_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_QC_R2.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_QC_se.fastq.gz
    jobid: 58
    wildcards: sample=MM14-SB

Removing temporary output file MM14-SB/sequence_quality_control/MM14-SB_clean_R1.fastq.gz.
Removing temporary output file MM14-SB/sequence_quality_control/MM14-SB_clean_R2.fastq.gz.
Removing temporary output file MM14-SB/sequence_quality_control/MM14-SB_clean_se.fastq.gz.
[Tue Mar 5 06:59:15 2019]
Finished job 58.
47 of 459 steps (10%) done

[Tue Mar 5 06:59:15 2019]
rule get_read_stats:
    input: MM14-SB/sequence_quality_control/MM14-SB_QC_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_QC_R2.fastq.gz
    output: MM14-SB/sequence_quality_control/read_stats/QC.zip, MM14-SB/sequence_quality_control/read_stats/QC_read_counts.tsv
    log: MM14-SB/logs/QC/read_stats/QC.log
    jobid: 64
    wildcards: sample=MM14-SB, step=QC
    priority: 30
    threads: 16
    resources: mem=32, java_mem=27

[Tue Mar 5 07:01:21 2019]
Finished job 64.
48 of 459 steps (10%) done

[Tue Mar 5 07:01:21 2019]
rule run_decontamination:
    input: MM13-SB/sequence_quality_control/MM13-SB_filtered_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_filtered_R2.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_filtered_se.fastq.gz, ref/genome/1/summary.txt
    output: MM13-SB/sequence_quality_control/MM13-SB_clean_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_clean_R2.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_clean_se.fastq.gz, MM13-SB/sequence_quality_control/contaminants/PhiX_R1.fastq.gz, MM13-SB/sequence_quality_control/contaminants/PhiX_R2.fastq.gz, MM13-SB/sequence_quality_control/contaminants/PhiX_se.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_decontamination_reference_stats.txt
    log: MM13-SB/logs/QC/decontamination.log
    jobid: 213
    benchmark: logs/benchmarks/QC/decontamination/MM13-SB.txt
    wildcards: sample=MM13-SB
    threads: 16
    resources: mem=32, java_mem=27

        if [ "true" = true ] ; then
            bbsplit.sh in1=MM13-SB/sequence_quality_control/MM13-SB_filtered_R1.fastq.gz in2=MM13-SB/sequence_quality_control/MM13-SB_filtered_R2.fastq.gz                     outu1=MM13-SB/sequence_quality_control/MM13-SB_clean_R1.fastq.gz outu2=MM13-SB/sequence_quality_control/MM13-SB_clean_R2.fastq.gz                     basename="MM13-SB/sequence_quality_control/contaminants/%_R#.fastq.gz"                     maxindel=20 minratio=0.65                     minhits=1 ambiguous=best refstats=MM13-SB/sequence_quality_control/MM13-SB_decontamination_reference_stats.txt                    threads=16 k=13 local=t                     -Xmx27G 2> MM13-SB/logs/QC/decontamination.log
        fi

        bbsplit.sh in=MM13-SB/sequence_quality_control/MM13-SB_filtered_se.fastq.gz                  outu=MM13-SB/sequence_quality_control/MM13-SB_clean_se.fastq.gz                 basename="MM13-SB/sequence_quality_control/contaminants/%_se.fastq.gz"                 maxindel=20 minratio=0.65                 minhits=1 ambiguous=best refstats=MM13-SB/sequence_quality_control/MM13-SB_decontamination_reference_stats.txt append                 interleaved=f threads=16 k=13 local=t                 -Xmx27G 2>> MM13-SB/logs/QC/decontamination.log

Activating conda environment: /databases/conda_envs/81c7173e
Removing temporary output file MM13-SB/sequence_quality_control/MM13-SB_filtered_R1.fastq.gz.
Removing temporary output file MM13-SB/sequence_quality_control/MM13-SB_filtered_R2.fastq.gz.
Removing temporary output file MM13-SB/sequence_quality_control/MM13-SB_filtered_se.fastq.gz.
[Tue Mar 5 07:02:52 2019]
Finished job 213.
49 of 459 steps (11%) done

[Tue Mar 5 07:02:52 2019]
rule get_read_stats:
    input: MM13-SB/sequence_quality_control/MM13-SB_clean_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_clean_R2.fastq.gz
    output: MM13-SB/sequence_quality_control/read_stats/clean.zip, MM13-SB/sequence_quality_control/read_stats/clean_read_counts.tsv
    log: MM13-SB/logs/QC/read_stats/clean.log
    jobid: 47
    wildcards: sample=MM13-SB, step=clean
    priority: 30
    threads: 16
    resources: mem=32, java_mem=27

[Tue Mar 5 07:04:50 2019]
Finished job 47.
50 of 459 steps (11%) done

[Tue Mar 5 07:04:50 2019]
localrule init_pre_assembly_processing:
    input: MM14-SB/sequence_quality_control/MM14-SB_QC_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_QC_R2.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_QC_se.fastq.gz
    output: MM14-SB/assembly/reads/QC_se.fastq.gz
    jobid: 443
    wildcards: sample=MM14-SB, fraction=se

[Tue Mar 5 07:04:50 2019]
localrule init_pre_assembly_processing:
    input: MM14-SB/sequence_quality_control/MM14-SB_QC_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_QC_R2.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_QC_se.fastq.gz
    output: MM14-SB/assembly/reads/QC_R2.fastq.gz
    jobid: 442
    wildcards: sample=MM14-SB, fraction=R2

[Tue Mar 5 07:04:50 2019]
localrule qcreads:
    input: MM13-SB/sequence_quality_control/MM13-SB_clean_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_clean_R2.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_clean_se.fastq.gz
    output: MM13-SB/sequence_quality_control/MM13-SB_QC_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_QC_R2.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_QC_se.fastq.gz
    jobid: 42
    wildcards: sample=MM13-SB

[Tue Mar 5 07:04:50 2019]
localrule init_pre_assembly_processing:
    input: MM14-SB/sequence_quality_control/MM14-SB_QC_R1.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_QC_R2.fastq.gz, MM14-SB/sequence_quality_control/MM14-SB_QC_se.fastq.gz
    output: MM14-SB/assembly/reads/QC_R1.fastq.gz
    jobid: 441
    wildcards: sample=MM14-SB, fraction=R1

[Tue Mar 5 07:04:50 2019]
localrule write_read_counts:
    input: MM14-SB/sequence_quality_control/read_stats/raw_read_counts.tsv, MM14-SB/sequence_quality_control/read_stats/deduplicated_read_counts.tsv, MM14-SB/sequence_quality_control/read_stats/filtered_read_counts.tsv, MM14-SB/sequence_quality_control/read_stats/clean_read_counts.tsv, MM14-SB/sequence_quality_control/read_stats/QC_read_counts.tsv
    output: MM14-SB/sequence_quality_control/read_stats/read_counts.tsv
    jobid: 237
    wildcards: sample=MM14-SB

[Tue Mar 5 07:04:52 2019]
Finished job 441.
51 of 459 steps (11%) done
[Tue Mar 5 07:04:53 2019]
Finished job 442.
52 of 459 steps (11%) done
[Tue Mar 5 07:04:53 2019]
Finished job 443.
53 of 459 steps (12%) done
Removing temporary output file MM14-SB/sequence_quality_control/read_stats/raw_read_counts.tsv.
Removing temporary output file MM14-SB/sequence_quality_control/read_stats/deduplicated_read_counts.tsv.
Removing temporary output file MM14-SB/sequence_quality_control/read_stats/filtered_read_counts.tsv.
Removing temporary output file MM14-SB/sequence_quality_control/read_stats/clean_read_counts.tsv.
Removing temporary output file MM14-SB/sequence_quality_control/read_stats/QC_read_counts.tsv.
[Tue Mar 5 07:04:53 2019]
Finished job 237.
54 of 459 steps (12%) done
Removing temporary output file MM13-SB/sequence_quality_control/MM13-SB_clean_R2.fastq.gz.
Removing temporary output file MM13-SB/sequence_quality_control/MM13-SB_clean_se.fastq.gz.
Removing temporary output file MM13-SB/sequence_quality_control/MM13-SB_clean_R1.fastq.gz.
[Tue Mar 5 07:04:56 2019]
Finished job 42.
55 of 459 steps (12%) done

[Tue Mar 5 07:04:56 2019]
rule get_read_stats:
    input: MM13-SB/sequence_quality_control/MM13-SB_QC_R1.fastq.gz, MM13-SB/sequence_quality_control/MM13-SB_QC_R2.fastq.gz
    output: MM13-SB/sequence_quality_control/read_stats/QC.zip, MM13-SB/sequence_quality_control/read_stats/QC_read_counts.tsv
    log: MM13-SB/logs/QC/read_stats/QC.log
    jobid: 48
    wildcards: sample=MM13-SB, step=QC
    priority: 30
    threads: 16
    resources: mem=32, java_mem=27

[Tue Mar 5 07:07:00 2019]
Finished job 48.
56 of 459 steps (12%) done

[Tue Mar 5 07:07:00 2019]
rule error_correction:
    input: MM14-SB/assembly/reads/QC_R1.fastq.gz, MM14-SB/assembly/reads/QC_R2.fastq.gz, MM14-SB/assembly/reads/QC_se.fastq.gz
    output: MM14-SB/assembly/reads/QC.errorcorr_R1.fastq.gz, MM14-SB/assembly/reads/QC.errorcorr_R2.fastq.gz, MM14-SB/assembly/reads/QC.errorcorr_se.fastq.gz
    log: MM14-SB/logs/assembly/pre_process/error_correction_QC.log
    jobid: 423
    benchmark: logs/benchmarks/assembly/pre_process/MM14-SB_error_correction_QC.txt
    wildcards: sample=MM14-SB, previous_steps=QC
    threads: 16
    resources: mem=32, java_mem=27

    tadpole.sh -Xmx27G             prealloc=1             in1=MM14-SB/assembly/reads/QC_R1.fastq.gz,MM14-SB/assembly/reads/QC_se.fastq.gz in2=MM14-SB/assembly/reads/QC_R2.fastq.gz             out1=MM14-SB/assembly/reads/QC.errorcorr_R1.fastq.gz,MM14-SB/assembly/reads/QC.errorcorr_se.fastq.gz out2=MM14-SB/assembly/reads/QC.errorcorr_R2.fastq.gz             mode=correct             threads=16             ecc=t ecco=t 2>> MM14-SB/logs/assembly/pre_process/error_correction_QC.log

Activating conda environment: /databases/conda_envs/81c7173e
Removing temporary output file MM14-SB/assembly/reads/QC_R1.fastq.gz.
Removing temporary output file MM14-SB/assembly/reads/QC_R2.fastq.gz.
Removing temporary output file MM14-SB/assembly/reads/QC_se.fastq.gz.
[Tue Mar 5 07:11:17 2019]
Finished job 423.
57 of 459 steps (12%) done

[Tue Mar 5 07:11:17 2019]
rule merge_pairs:
    input: MM14-SB/assembly/reads/QC.errorcorr_R1.fastq.gz, MM14-SB/assembly/reads/QC.errorcorr_R2.fastq.gz, MM14-SB/assembly/reads/QC.errorcorr_se.fastq.gz
    output: MM14-SB/assembly/reads/QC.errorcorr.merged_R1.fastq.gz, MM14-SB/assembly/reads/QC.errorcorr.merged_R2.fastq.gz, MM14-SB/assembly/reads/QC.errorcorr.merged_se.fastq.gz, MM14-SB/assembly/reads/QC.errorcorr.merged_me.fastq.gz
    log: MM14-SB/logs/assembly/pre_process/merge_pairs_QC.errorcorr.log
    jobid: 401
    benchmark: logs/benchmarks/assembly/pre_process/merge_pairs_QC.errorcorr/MM14-SB.txt
    wildcards: sample=MM14-SB, previous_steps=QC.errorcorr
    threads: 16
    resources: mem=32, java_mem=27

    bbmerge.sh -Xmx27G threads=16             in1=MM14-SB/assembly/reads/QC.errorcorr_R1.fastq.gz in2=MM14-SB/assembly/reads/QC.errorcorr_R2.fastq.gz             outmerged=MM14-SB/assembly/reads/QC.errorcorr.merged_me.fastq.gz             outu=MM14-SB/assembly/reads/QC.errorcorr.merged_R1.fastq.gz outu
SilasK commented 5 years ago

Sorry, I don’t see the logs.

Try to add them directly to https://github.com/metagenome-atlas/atlas/issues/143

I don't think lowering the minimum contig length below 200 bp is a good idea.
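For anyone who wants to check how many contigs a 200 bp cutoff would actually drop, here is a minimal sketch in plain Python. It is not part of Atlas; the function names `read_fasta` and `filter_by_length` are made up for this illustration, and it assumes a simple FASTA layout (header lines starting with `>`, sequence possibly wrapped over several lines).

```python
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(
    records: Iterable[Tuple[str, str]], min_length: int = 200
) -> Iterator[Tuple[str, str]]:
    """Keep only records whose sequence has at least min_length bases."""
    for header, seq in records:
        if len(seq) >= min_length:
            yield header, seq


if __name__ == "__main__":
    # Toy input: contig lengths are 240, 40 and 200 bases.
    example = [">a", "ACGT" * 60, ">b", "ACGT" * 10, ">c", "A" * 100, "C" * 100]
    for name, seq in filter_by_length(read_fasta(example), min_length=200):
        print(name, len(seq))
```

To run it on a real assembly, pass an open file handle to `read_fasta` instead of the toy list; comparing the record count before and after filtering shows how much the cutoff removes.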

MCamp91 commented 5 years ago

Metatranscriptome_metaSpades_run.txt Metagenome_run.txt Metatranscriptomics_rnaSpades_run.txt

SilasK commented 5 years ago

@MCamp91 @mykophile could you try out #183?