Closed AlessioMilanese closed 8 months ago
Hi,
You can find test data in the repository: https://github.com/yanhui09/laca/blob/master/laca/workflow/resources/data/raw.fastq.gz
If you set basecalled_dir to path/to/laca/workflow/resources/data in the config.yaml, you can do a demo run.
BTW, you probably need to set n (under the seqkit config parameters) to e.g. 1000 if you do subsampling on the test data.
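If you are unsure what value of n makes sense, it can help to check how many reads the test file actually holds before subsampling. A minimal sketch, assuming the usual four-lines-per-record FASTQ layout; count_fastq_reads is a hypothetical helper, not part of laca:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of laca): count reads in a FASTQ file.
# Assumes each record spans exactly 4 lines (no wrapped sequence lines).
count_fastq_reads() {
    local fq="$1"
    local lines
    if [[ "$fq" == *.gz ]]; then
        # Decompress to stdout so the original file stays untouched.
        lines=$(gzip -dc "$fq" | wc -l)
    else
        lines=$(wc -l < "$fq")
    fi
    echo $(( lines / 4 ))
}

# Example (placeholder path):
# count_fastq_reads path/to/laca/workflow/resources/data/raw.fastq.gz
```

The printed number can then be compared against the n you plan to set.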
Ok, thanks for clarifying.
So I have to run:
# unzip
gunzip raw.fastq.gz
# init config file and check
laca init -b raw.fastq -d /path/to/database
# start analysis
laca run all
Could you provide me a "database" for the -d option? Is it a fasta file or a directory?
You don't need to gunzip the fastq files.
laca init will create a config.yaml in your working directory (. by default), and an existing config.yaml can be reused in other runs.
With the generated config.yaml, you can control the parameters used to run laca.
laca uses conda or docker (when a special environment is required) to control the software versions. On first use, it downloads the software and databases by itself, so it takes some time.
-d in laca init just defines where you want to store the software and databases. You can set it anywhere, as long as you have permission.
-b corresponds to the directory holding the basecalled fastq or fastq.gz files (fastq files are generated in batches by guppy). It needs to be the path to a directory.
In short, it should look like:
laca init -b /path/to/basecalled_dir -d /path/to/database
#check the parameters in the generated config file
laca run all
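Since -b expects a directory rather than a single file, one way to prepare a run is to collect the fastq.gz files into a directory first. A minimal sketch; stage_fastq_dir is a hypothetical helper and all paths are placeholders:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of laca): copy fastq/fastq.gz files into a
# directory that can then be passed to `laca init -b`.
stage_fastq_dir() {
    local dest="$1"; shift
    mkdir -p "$dest"
    local f
    for f in "$@"; do
        cp -- "$f" "$dest/"
    done
    # Print the absolute path of the staged directory.
    (cd "$dest" && pwd)
}

# Example (placeholder paths; assumes laca is installed):
# fq_dir=$(stage_fastq_dir ./tmp_fastq /path/to/raw.fastq.gz)
# laca init -b "$fq_dir" -d /path/to/database
# laca run all
```

The files are copied rather than symlinked so that the directory still resolves if it is later mounted into a container.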
Not sure if I'm doing something wrong. Here's what I'm running.
I have the file in the directory tmp_fastq:
alessiomilanese:tmp$ docker run -v `pwd`:/home --privileged yanhui09/laca ls tmp_fastq
raw.fastq.gz
I run the init:
alessiomilanese:tmp$ docker run -v `pwd`:/home --privileged yanhui09/laca laca init -b /home/tmp_fastq -d tmp_db
2024-02-01 10:33:28,381 - root - INFO - LACA version: 0+untagged.1.g3131263 (laca.py:436)
2024-02-01 10:33:28,410 - root - INFO - Config file [config.yaml] created in /home. (config.py:196)
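The commands in this post all repeat the same docker run prefix. A small wrapper function can save some typing; laca_docker is a hypothetical name, and the mount point and image are simply copied from the commands shown here:

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run a command inside the yanhui09/laca image with the
# current directory mounted at /home, as in the commands in this post.
laca_docker() {
    docker run -v "$(pwd)":/home --privileged yanhui09/laca "$@"
}

# Examples (same calls as in this post):
# laca_docker ls tmp_fastq
# laca_docker laca init -b /home/tmp_fastq -d tmp_db
# laca_docker laca run all
```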
and run all:
alessiomilanese:tmp$ docker run -v `pwd`:/home --privileged yanhui09/laca laca run all
2024-02-01 10:33:38,513 - root - INFO - LACA version: 0+untagged.1.g3131263 (laca.py:61)
2024-02-01 10:33:38,538 - root - DEBUG - Executing: snakemake all --directory '/home' --snakefile '/tmp/repo/laca/workflow/Snakefile' --configfile '/home/config.yaml' --use-conda --conda-prefix '/home/tmp_db/conda_envs' --use-singularity --singularity-prefix '/home/tmp_db/singularity_envs' --singularity-args '--bind /tmp/repo/laca/workflow/resources/guppy_barcoding/:/opt/ont/guppy/data/barcoding/,/home/tmp_fastq' --rerun-triggers mtime --rerun-incomplete --scheduler greedy --jobs 6 --nolock --resources mem=957 mem_mb=980158 java_mem=813 (laca.py:105)
Config file config.yaml is extended by additional config specified via the command line.
Building DAG of jobs...
Pulling singularity image docker://genomicpariscentre/guppy:3.3.3.
Your conda installation is not configured to use strict channel priorities. This is however crucial for having robust and correct environments (for details, see https://conda-forge.org/docs/user/tipsandtricks.html). Please consider to configure strict priorities by executing 'conda config --set channel_priority strict'.
Creating conda environment ../tmp/repo/laca/workflow/envs/mmseqs2.yaml...
Downloading and installing remote packages.
Environment for /tmp/repo/laca/workflow/rules/../envs/mmseqs2.yaml created (location: tmp_db/conda_envs/9f7ce0c287dd42c65e27d60a5610c12a_)
Creating conda environment ../tmp/repo/laca/workflow/envs/cutadapt.yaml...
Downloading and installing remote packages.
Environment for /tmp/repo/laca/workflow/rules/../envs/cutadapt.yaml created (location: tmp_db/conda_envs/6ff289ee8fcb2a9d8152199a50d587b9_)
Creating conda environment ../tmp/repo/laca/workflow/envs/isONcorCon.yaml...
Downloading and installing remote packages.
Environment for /tmp/repo/laca/workflow/rules/../envs/isONcorCon.yaml created (location: tmp_db/conda_envs/7724e5612be64a299e2f41b61b024020_)
Creating conda environment ../tmp/repo/laca/workflow/envs/q2plugs.yaml...
Downloading and installing remote packages.
Environment for /tmp/repo/laca/workflow/rules/../envs/q2plugs.yaml created (location: tmp_db/conda_envs/8adeba28bea1ec0ded02293db943f5e2_)
Using shell: /usr/bin/bash
Provided cores: 6
Rules claiming more threads will be scaled down.
Provided resources: mem=957, mem_mb=980158, java_mem=813
Job stats:
job                      count    min threads    max threads
---------------------  -------  -------------  -------------
all                          1              1              1
check_primers_repseqs        1              2              2
cls_isONclust                1              1              1
cls_kmerCon                  1              1              1
cls_meshclust                1              1              1
col_q2blast_batch            1              1              1
collect_consensus            1              1              1
combine_cls                  1              1              1
combine_fastq                1              1              1
count_matrix                 1              1              1
demux_check                  1              1              1
drep_consensus               1              6              6
exclude_empty_fqs            1              1              1
get_taxonomy                 1              1              1
get_tree                     1              1              1
guppy                        1              6              6
isONclust                    1              6              6
matrix_seqid                 1              1              1
q2_fasttree                  1              6              6
q2_repseqs                   1              1              1
q2export_tree                1              1              1
rename_drep_seqs             1              1              1
repseqs_split                1              1              1
total                       23              1              6
Select jobs to execute...
[Thu Feb 1 10:35:19 2024]
localrule guppy:
output: demux_guppy
log: logs/demultiplex/guppy.log
jobid: 9
benchmark: benchmarks/demultiplex/guppy.txt
reason: Missing output files: demux_guppy
threads: 6
resources: tmpdir=/tmp, mem=50
Activating singularity image /home/tmp_db/singularity_envs/be79a9f6f5e87678ce46ad686c92cb19.simg
ONT Guppy barcoding software version 3.3.3+fa743a6
input path: /home/tmp_fastq
save path: demux_guppy
arrangement files: barcode_arrs_16S-GXO192.cfg
min. score front: 60
min. score rear: 60
Found 1 fastq files.
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
Done in 157921 ms.
[Thu Feb 1 10:37:57 2024]
Finished job 9.
1 of 23 steps (4%) done
Select jobs to execute...
[Thu Feb 1 10:37:57 2024]
localcheckpoint demux_check:
input: demux_guppy
output: demultiplexed
log: logs/demultiplex/check.log
jobid: 8
benchmark: benchmarks/demultiplex/check.txt
reason: Missing output files: demultiplexed; Input files updated by another job: demux_guppy
resources: tmpdir=/tmp
DAG of jobs will be updated after completion.
[Thu Feb 1 10:37:57 2024]
Finished job 8.
2 of 23 steps (9%) done
Creating conda environment ../tmp/repo/laca/workflow/envs/yacrd.yaml...
Downloading and installing remote packages.
Environment for /tmp/repo/laca/workflow/rules/../envs/yacrd.yaml created (location: tmp_db/conda_envs/2076c028367dc304ff8332a2dc20dc22_)
Select jobs to execute...
[Thu Feb 1 10:37:59 2024]
localrule collect_fastq:
input: demultiplexed/BRK13
output: qc/BRK13.fastq
log: logs/demultiplex/collect_fastq/BRK13.log
jobid: 40
benchmark: benchmarks/demultiplex/collect_fastq/BRK13.txt
reason: Missing output files: qc/BRK13.fastq
wildcards: barcode=BRK13
resources: tmpdir=/tmp
[Thu Feb 1 10:37:59 2024]
Finished job 40.
3 of 30 steps (10%) done
Select jobs to execute...
[Thu Feb 1 10:37:59 2024]
rule check_primers:
input: qc/BRK13.fastq
output: qc/primers_passed/BRK13F.fastq, qc/primers_unpassed/BRK13F.fastq
log: logs/qc/check_primersF/BRK13.log
jobid: 39
benchmark: benchmarks/qc/check_primersF/BRK13.txt
reason: Missing output files: qc/primers_unpassed/BRK13F.fastq, qc/primers_passed/BRK13F.fastq; Input files updated by another job: qc/BRK13.fastq
wildcards: barcode=BRK13
threads: 6
resources: tmpdir=/tmp, mem=10, time=1
Activating conda environment: tmp_db/conda_envs/6ff289ee8fcb2a9d8152199a50d587b9_
[Thu Feb 1 10:38:00 2024]
Finished job 39.
4 of 30 steps (13%) done
Removing temporary output qc/BRK13.fastq.
Select jobs to execute...
[Thu Feb 1 10:38:00 2024]
rule check_primersR:
input: qc/primers_unpassed/BRK13F.fastq
output: qc/primers_passed/BRK13R.fastq, qc/primers_unpassed/BRK13.fastq
log: logs/qc/check_primersR/BRK13.log
jobid: 41
benchmark: benchmarks/qc/check_primersR/BRK13.txt
reason: Missing output files: qc/primers_passed/BRK13R.fastq; Input files updated by another job: qc/primers_unpassed/BRK13F.fastq
wildcards: barcode=BRK13
threads: 6
resources: tmpdir=/tmp, mem=10, time=1
Activating conda environment: tmp_db/conda_envs/6ff289ee8fcb2a9d8152199a50d587b9_
[Thu Feb 1 10:38:01 2024]
Finished job 41.
5 of 30 steps (17%) done
Removing temporary output qc/primers_unpassed/BRK13F.fastq.
Removing temporary output qc/primers_unpassed/BRK13.fastq.
Select jobs to execute...
[Thu Feb 1 10:38:01 2024]
rule revcomp_fq_combine:
input: qc/primers_passed/BRK13F.fastq, qc/primers_passed/BRK13R.fastq
output: qc/primers_passed/BRK13R_revcomp.fastq, qc/primers_passed/BRK13.fastq
jobid: 38
reason: Missing output files: qc/primers_passed/BRK13.fastq; Input files updated by another job: qc/primers_passed/BRK13R.fastq, qc/primers_passed/BRK13F.fastq
wildcards: barcode=BRK13
threads: 2
resources: tmpdir=/tmp, mem=10, time=1
[Thu Feb 1 10:38:01 2024]
Finished job 38.
6 of 30 steps (20%) done
Removing temporary output qc/primers_passed/BRK13F.fastq.
Removing temporary output qc/primers_passed/BRK13R.fastq.
Removing temporary output qc/primers_passed/BRK13R_revcomp.fastq.
Select jobs to execute...
[Thu Feb 1 10:38:01 2024]
rule minimap2ava_yacrd:
input: qc/primers_passed/BRK13.fastq
output: qc/yacrd/BRK13.paf
log: logs/qc/yacrd/BRK13_ava.log
jobid: 42
benchmark: benchmarks/qc/yacrd/BRK13_ava.txt
reason: Missing output files: qc/yacrd/BRK13.paf; Input files updated by another job: qc/primers_passed/BRK13.fastq
wildcards: barcode=BRK13
threads: 6
resources: tmpdir=/tmp, mem=50, time=1
Activating conda environment: tmp_db/conda_envs/2076c028367dc304ff8332a2dc20dc22_
[Thu Feb 1 10:38:05 2024]
Finished job 42.
7 of 30 steps (23%) done
Select jobs to execute...
[Thu Feb 1 10:38:05 2024]
rule yacrd:
input: qc/primers_passed/BRK13.fastq, qc/yacrd/BRK13.paf
output: qc/yacrd/BRK13.fastq
log: logs/qc/yacrd/BRK13_filter.log
jobid: 37
benchmark: benchmarks/qc/yacrd/BRK13_filter.txt
reason: Missing output files: qc/yacrd/BRK13.fastq; Input files updated by another job: qc/yacrd/BRK13.paf, qc/primers_passed/BRK13.fastq
wildcards: barcode=BRK13
threads: 6
resources: tmpdir=/tmp, mem=50, time=1
Activating conda environment: tmp_db/conda_envs/2076c028367dc304ff8332a2dc20dc22_
[Thu Feb 1 10:38:06 2024]
Finished job 37.
8 of 30 steps (27%) done
Removing temporary output qc/primers_passed/BRK13.fastq.
Removing temporary output qc/yacrd/BRK13.paf.
Select jobs to execute...
[Thu Feb 1 10:38:06 2024]
rule q_filter:
input: qc/yacrd/BRK13.fastq
output: qc/qfilt/BRK13.fastq
jobid: 36
reason: Missing output files: qc/qfilt/BRK13.fastq; Input files updated by another job: qc/yacrd/BRK13.fastq
wildcards: barcode=BRK13
threads: 2
resources: tmpdir=/tmp, mem=10, time=1
[Thu Feb 1 10:38:06 2024]
Finished job 36.
9 of 30 steps (30%) done
Removing temporary output qc/yacrd/BRK13.fastq.
Select jobs to execute...
[Thu Feb 1 10:38:06 2024]
localcheckpoint exclude_empty_fqs:
input: qc/qfilt/BRK13.fastq
output: .qc_DONE
jobid: 7
reason: Missing output files: .qc_DONE; Input files updated by another job: qc/qfilt/BRK13.fastq
resources: tmpdir=/tmp
DAG of jobs will be updated after completion.
Touching output file .qc_DONE.
[Thu Feb 1 10:38:06 2024]
Finished job 7.
10 of 30 steps (33%) done
Select jobs to execute...
[Thu Feb 1 10:38:06 2024]
localrule combine_fastq:
input: qc/qfilt/BRK13.fastq
output: qc/qfilt/pooled.fastq
jobid: 12
reason: Missing output files: qc/qfilt/pooled.fastq
resources: tmpdir=/tmp
[Thu Feb 1 10:38:06 2024]
Finished job 12.
11 of 30 steps (37%) done
Select jobs to execute...
[Thu Feb 1 10:38:06 2024]
rule isONclust:
input: qc/qfilt/pooled.fastq
output: clust/isONclust/pooled, clust/isONclust/pooled.tsv
log: logs/clust/isONclust/pooled.log
jobid: 11
benchmark: benchmarks/clust/isONclust/pooled.txt
reason: Missing output files: clust/isONclust/pooled.tsv; Input files updated by another job: qc/qfilt/pooled.fastq
wildcards: barcode=pooled
threads: 6
resources: tmpdir=/tmp, mem=50, time=30
Activating conda environment: tmp_db/conda_envs/7724e5612be64a299e2f41b61b024020_
[Thu Feb 1 10:38:08 2024]
Finished job 11.
12 of 30 steps (40%) done
Removing temporary output clust/isONclust/pooled.
Select jobs to execute...
[Thu Feb 1 10:38:08 2024]
localcheckpoint cls_isONclust:
input: qc/qfilt/BRK13.fastq, clust/isONclust/pooled.tsv
output: clust/isONclust/read2cluster
jobid: 10
reason: Missing output files: clust/isONclust/read2cluster; Input files updated by another job: clust/isONclust/pooled.tsv, qc/qfilt/BRK13.fastq
resources: tmpdir=/tmp
DAG of jobs will be updated after completion.
Config file config.yaml is extended by additional config specified via the command line.
Building DAG of jobs...
Your conda installation is not configured to use strict channel priorities. This is however crucial for having robust and correct environments (for details, see https://conda-forge.org/docs/user/tipsandtricks.html). Please consider to configure strict priorities by executing 'conda config --set channel_priority strict'.
Using shell: /usr/bin/bash
Provided cores: 6
Rules claiming more threads will be scaled down.
Select jobs to execute...
[Thu Feb 1 10:38:09 2024]
Finished job 10.
13 of 30 steps (43%) done
Pulling singularity image docker://yanhui09/identity:latest.
Select jobs to execute...
[Thu Feb 1 10:38:52 2024]
localrule fqs_split_isONclust:
input: clust/isONclust/read2cluster/pooled_1.csv, qc/qfilt/pooled.fastq
output: clust/isONclust/pooled_1.split.fastq
jobid: 65
reason: Missing output files: clust/isONclust/pooled_1.split.fastq
wildcards: barcode=pooled, c1=1
resources: tmpdir=/tmp
[Thu Feb 1 10:38:52 2024]
localrule fqs_split_isONclust:
input: clust/isONclust/read2cluster/pooled_2.csv, qc/qfilt/pooled.fastq
output: clust/isONclust/pooled_2.split.fastq
jobid: 69
reason: Missing output files: clust/isONclust/pooled_2.split.fastq
wildcards: barcode=pooled, c1=2
resources: tmpdir=/tmp
[Thu Feb 1 10:38:52 2024]
localrule fqs_split_isONclust:
input: clust/isONclust/read2cluster/pooled_0.csv, qc/qfilt/pooled.fastq
output: clust/isONclust/pooled_0.split.fastq
jobid: 61
reason: Missing output files: clust/isONclust/pooled_0.split.fastq
wildcards: barcode=pooled, c1=0
resources: tmpdir=/tmp
[Thu Feb 1 10:38:52 2024]
Finished job 69.
14 of 42 steps (33%) done
Select jobs to execute...
[Thu Feb 1 10:38:52 2024]
localrule fqs_split_isONclust2:
input: clust/isONclust/pooled_2.split.fastq
output: clust/isONclust/split/pooled_2_0.fastq
jobid: 68
reason: Missing output files: clust/isONclust/split/pooled_2_0.fastq; Input files updated by another job: clust/isONclust/pooled_2.split.fastq
wildcards: barcode=pooled, c1=2
resources: tmpdir=/tmp
[Thu Feb 1 10:38:52 2024]
Finished job 65.
15 of 42 steps (36%) done
Select jobs to execute...
[Thu Feb 1 10:38:52 2024]
localrule fqs_split_isONclust2:
input: clust/isONclust/pooled_1.split.fastq
output: clust/isONclust/split/pooled_1_0.fastq
jobid: 64
reason: Missing output files: clust/isONclust/split/pooled_1_0.fastq; Input files updated by another job: clust/isONclust/pooled_1.split.fastq
wildcards: barcode=pooled, c1=1
resources: tmpdir=/tmp
[Thu Feb 1 10:38:52 2024]
Finished job 68.
16 of 42 steps (38%) done
Removing temporary output clust/isONclust/pooled_2.split.fastq.
Select jobs to execute...
[Thu Feb 1 10:38:52 2024]
localrule fq2fa4meshclust:
input: clust/isONclust/split/pooled_2_0.fastq
output: clust/meshclust/pooled_2_0.fasta
jobid: 67
reason: Missing output files: clust/meshclust/pooled_2_0.fasta; Input files updated by another job: clust/isONclust/split/pooled_2_0.fastq
wildcards: barcode=pooled, c1=2, c2=0
resources: tmpdir=/tmp
[Thu Feb 1 10:38:52 2024]
Finished job 64.
17 of 42 steps (40%) done
Removing temporary output clust/isONclust/pooled_1.split.fastq.
Select jobs to execute...
[Thu Feb 1 10:38:52 2024]
localrule fq2fa4meshclust:
input: clust/isONclust/split/pooled_1_0.fastq
output: clust/meshclust/pooled_1_0.fasta
jobid: 63
reason: Missing output files: clust/meshclust/pooled_1_0.fasta; Input files updated by another job: clust/isONclust/split/pooled_1_0.fastq
wildcards: barcode=pooled, c1=1, c2=0
resources: tmpdir=/tmp
[Thu Feb 1 10:38:52 2024]
Finished job 61.
18 of 42 steps (43%) done
Select jobs to execute...
[Thu Feb 1 10:38:52 2024]
localrule fqs_split_isONclust2:
input: clust/isONclust/pooled_0.split.fastq
output: clust/isONclust/split/pooled_0_0.fastq
jobid: 60
reason: Missing output files: clust/isONclust/split/pooled_0_0.fastq; Input files updated by another job: clust/isONclust/pooled_0.split.fastq
wildcards: barcode=pooled, c1=0
resources: tmpdir=/tmp
[Thu Feb 1 10:38:52 2024]
Finished job 67.
19 of 42 steps (45%) done
Removing temporary output clust/isONclust/split/pooled_2_0.fastq.
Select jobs to execute...
[Thu Feb 1 10:38:52 2024]
Finished job 60.
20 of 42 steps (48%) done
Removing temporary output clust/isONclust/pooled_0.split.fastq.
[Thu Feb 1 10:38:52 2024]
localrule fq2fa4meshclust:
input: clust/isONclust/split/pooled_0_0.fastq
output: clust/meshclust/pooled_0_0.fasta
jobid: 59
reason: Missing output files: clust/meshclust/pooled_0_0.fasta; Input files updated by another job: clust/isONclust/split/pooled_0_0.fastq
wildcards: barcode=pooled, c1=0, c2=0
resources: tmpdir=/tmp
[Thu Feb 1 10:38:52 2024]
Finished job 63.
21 of 42 steps (50%) done
Removing temporary output clust/isONclust/split/pooled_1_0.fastq.
Select jobs to execute...
[Thu Feb 1 10:38:52 2024]
Finished job 59.
22 of 42 steps (52%) done
Removing temporary output clust/isONclust/split/pooled_0_0.fastq.
[Thu Feb 1 10:38:52 2024]
rule meshclust:
input: clust/meshclust/pooled_0_0.fasta
output: clust/meshclust/pooled_0_0.tsv
log: logs/clust/meshclust/pooled_0_0.log
jobid: 58
benchmark: benchmarks/mechclust/pooled_0_0.txt
reason: Missing output files: clust/meshclust/pooled_0_0.tsv; Input files updated by another job: clust/meshclust/pooled_0_0.fasta
wildcards: barcode=pooled, c1=0, c2=0
threads: 6
resources: tmpdir=/tmp, mem=50, time=30
Activating singularity image /home/tmp_db/singularity_envs/1f35c0780be107844efa2707b35541ca.simg
[Thu Feb 1 10:39:07 2024]
Finished job 58.
23 of 42 steps (55%) done
Removing temporary output clust/meshclust/pooled_0_0.fasta.
Select jobs to execute...
[Thu Feb 1 10:39:07 2024]
rule meshclust:
input: clust/meshclust/pooled_1_0.fasta
output: clust/meshclust/pooled_1_0.tsv
log: logs/clust/meshclust/pooled_1_0.log
jobid: 62
benchmark: benchmarks/mechclust/pooled_1_0.txt
reason: Missing output files: clust/meshclust/pooled_1_0.tsv; Input files updated by another job: clust/meshclust/pooled_1_0.fasta
wildcards: barcode=pooled, c1=1, c2=0
threads: 6
resources: tmpdir=/tmp, mem=50, time=30
Activating singularity image /home/tmp_db/singularity_envs/1f35c0780be107844efa2707b35541ca.simg
[Thu Feb 1 10:39:16 2024]
Finished job 62.
24 of 42 steps (57%) done
Removing temporary output clust/meshclust/pooled_1_0.fasta.
Select jobs to execute...
[Thu Feb 1 10:39:16 2024]
rule meshclust:
input: clust/meshclust/pooled_2_0.fasta
output: clust/meshclust/pooled_2_0.tsv
log: logs/clust/meshclust/pooled_2_0.log
jobid: 66
benchmark: benchmarks/mechclust/pooled_2_0.txt
reason: Missing output files: clust/meshclust/pooled_2_0.tsv; Input files updated by another job: clust/meshclust/pooled_2_0.fasta
wildcards: barcode=pooled, c1=2, c2=0
threads: 6
resources: tmpdir=/tmp, mem=50, time=30
Activating singularity image /home/tmp_db/singularity_envs/1f35c0780be107844efa2707b35541ca.simg
[Thu Feb 1 10:39:21 2024]
Finished job 66.
25 of 42 steps (60%) done
Removing temporary output clust/meshclust/pooled_2_0.fasta.
Select jobs to execute...
[Thu Feb 1 10:39:21 2024]
localcheckpoint cls_meshclust:
input: .qc_DONE, qc/qfilt/BRK13.fastq, clust/meshclust/pooled_0_0.tsv, clust/meshclust/pooled_1_0.tsv, clust/meshclust/pooled_2_0.tsv
output: clust/clusters
jobid: 6
reason: Missing output files: clust/clusters; Input files updated by another job: clust/meshclust/pooled_0_0.tsv, clust/meshclust/pooled_2_0.tsv, clust/meshclust/pooled_1_0.tsv
resources: tmpdir=/tmp
DAG of jobs will be updated after completion.
Config file config.yaml is extended by additional config specified via the command line.
Building DAG of jobs...
Your conda installation is not configured to use strict channel priorities. This is however crucial for having robust and correct environments (for details, see https://conda-forge.org/docs/user/tipsandtricks.html). Please consider to configure strict priorities by executing 'conda config --set channel_priority strict'.
Using shell: /usr/bin/bash
Provided cores: 6
Rules claiming more threads will be scaled down.
Select jobs to execute...
[Thu Feb 1 10:39:23 2024]
Finished job 6.
26 of 42 steps (62%) done
Select jobs to execute...
[Thu Feb 1 10:39:23 2024]
localrule fqs_split_meshclust:
input: clust/clusters/pooled_0_0_1.csv, clust/clusters/pooled_0_0_1.centroid, qc/qfilt/pooled.fastq
output: clust/members/pooled_0_0_1.fastq, clust/centroids/pooled_0_0_1.fasta
jobid: 77
reason: Missing output files: clust/members/pooled_0_0_1.fastq, clust/centroids/pooled_0_0_1.fasta
wildcards: barcode=pooled, c1=0, c2=0, c3=1
resources: tmpdir=/tmp
[Thu Feb 1 10:39:23 2024]
localrule fqs_split_meshclust:
input: clust/clusters/pooled_1_0_1.csv, clust/clusters/pooled_1_0_1.centroid, qc/qfilt/pooled.fastq
output: clust/members/pooled_1_0_1.fastq, clust/centroids/pooled_1_0_1.fasta
jobid: 76
reason: Missing output files: clust/members/pooled_1_0_1.fastq, clust/centroids/pooled_1_0_1.fasta
wildcards: barcode=pooled, c1=1, c2=0, c3=1
resources: tmpdir=/tmp
[Thu Feb 1 10:39:23 2024]
localrule fqs_split_meshclust:
input: clust/clusters/pooled_1_0_2.csv, clust/clusters/pooled_1_0_2.centroid, qc/qfilt/pooled.fastq
output: clust/members/pooled_1_0_2.fastq, clust/centroids/pooled_1_0_2.fasta
jobid: 74
reason: Missing output files: clust/members/pooled_1_0_2.fastq, clust/centroids/pooled_1_0_2.fasta
wildcards: barcode=pooled, c1=1, c2=0, c3=2
resources: tmpdir=/tmp
[Thu Feb 1 10:39:23 2024]
localrule fqs_split_meshclust:
input: clust/clusters/pooled_2_0_1.csv, clust/clusters/pooled_2_0_1.centroid, qc/qfilt/pooled.fastq
output: clust/members/pooled_2_0_1.fastq, clust/centroids/pooled_2_0_1.fasta
jobid: 75
reason: Missing output files: clust/centroids/pooled_2_0_1.fasta, clust/members/pooled_2_0_1.fastq
wildcards: barcode=pooled, c1=2, c2=0, c3=1
resources: tmpdir=/tmp
[Thu Feb 1 10:39:23 2024]
Finished job 76.
27 of 50 steps (54%) done
Select jobs to execute...
[Thu Feb 1 10:39:23 2024]
localrule fqs_split_kmerCon:
input: clust/members/pooled_1_0_1.fastq, clust/centroids/pooled_1_0_1.fasta
output: kmerCon/split/pooled_1_0_1cand1.fastq, kmerCon/polish/pooled_1_0_1cand1/minimap2/raw.fna
jobid: 80
reason: Missing output files: kmerCon/polish/pooled_1_0_1cand1/minimap2/raw.fna, kmerCon/split/pooled_1_0_1cand1.fastq; Input files updated by another job: clust/members/pooled_1_0_1.fastq, clust/centroids/pooled_1_0_1.fasta
wildcards: barcode=pooled, c1=1, c2=0, c3=1
resources: tmpdir=/tmp
[Thu Feb 1 10:39:23 2024]
Finished job 80.
28 of 50 steps (56%) done
Removing temporary output clust/centroids/pooled_1_0_1.fasta.
[Thu Feb 1 10:39:23 2024]
Finished job 77.
29 of 50 steps (58%) done
Select jobs to execute...
[Thu Feb 1 10:39:23 2024]
localrule fqs_split_kmerCon:
input: clust/members/pooled_0_0_1.fastq, clust/centroids/pooled_0_0_1.fasta
output: kmerCon/split/pooled_0_0_1cand1.fastq, kmerCon/polish/pooled_0_0_1cand1/minimap2/raw.fna
jobid: 81
reason: Missing output files: kmerCon/polish/pooled_0_0_1cand1/minimap2/raw.fna, kmerCon/split/pooled_0_0_1cand1.fastq; Input files updated by another job: clust/members/pooled_0_0_1.fastq, clust/centroids/pooled_0_0_1.fasta
wildcards: barcode=pooled, c1=0, c2=0, c3=1
resources: tmpdir=/tmp
[Thu Feb 1 10:39:23 2024]
Finished job 81.
30 of 50 steps (60%) done
Removing temporary output clust/centroids/pooled_0_0_1.fasta.
[Thu Feb 1 10:39:23 2024]
Finished job 75.
31 of 50 steps (62%) done
Select jobs to execute...
[Thu Feb 1 10:39:23 2024]
localrule fqs_split_kmerCon:
input: clust/members/pooled_2_0_1.fastq, clust/centroids/pooled_2_0_1.fasta
output: kmerCon/split/pooled_2_0_1cand1.fastq, kmerCon/polish/pooled_2_0_1cand1/minimap2/raw.fna
jobid: 79
reason: Missing output files: kmerCon/polish/pooled_2_0_1cand1/minimap2/raw.fna, kmerCon/split/pooled_2_0_1cand1.fastq; Input files updated by another job: clust/centroids/pooled_2_0_1.fasta, clust/members/pooled_2_0_1.fastq
wildcards: barcode=pooled, c1=2, c2=0, c3=1
resources: tmpdir=/tmp
[Thu Feb 1 10:39:23 2024]
Finished job 74.
32 of 50 steps (64%) done
Select jobs to execute...
[Thu Feb 1 10:39:23 2024]
localrule fqs_split_kmerCon:
input: clust/members/pooled_1_0_2.fastq, clust/centroids/pooled_1_0_2.fasta
output: kmerCon/split/pooled_1_0_2cand1.fastq, kmerCon/polish/pooled_1_0_2cand1/minimap2/raw.fna
jobid: 78
reason: Missing output files: kmerCon/split/pooled_1_0_2cand1.fastq, kmerCon/polish/pooled_1_0_2cand1/minimap2/raw.fna; Input files updated by another job: clust/members/pooled_1_0_2.fastq, clust/centroids/pooled_1_0_2.fasta
wildcards: barcode=pooled, c1=1, c2=0, c3=2
resources: tmpdir=/tmp
[Thu Feb 1 10:39:23 2024]
Finished job 79.
33 of 50 steps (66%) done
Removing temporary output clust/centroids/pooled_2_0_1.fasta.
[Thu Feb 1 10:39:23 2024]
Finished job 78.
34 of 50 steps (68%) done
Removing temporary output clust/centroids/pooled_1_0_2.fasta.
Select jobs to execute...
[Thu Feb 1 10:39:23 2024]
checkpoint cls_kmerCon:
input: clust/clusters, .qc_DONE, qc/qfilt/BRK13.fastq, clust/members/pooled_1_0_2.fastq, clust/members/pooled_2_0_1.fastq, clust/members/pooled_1_0_1.fastq, clust/members/pooled_0_0_1.fastq, kmerCon/split/pooled_1_0_2cand1.fastq, kmerCon/split/pooled_2_0_1cand1.fastq, kmerCon/split/pooled_1_0_1cand1.fastq, kmerCon/split/pooled_0_0_1cand1.fastq, kmerCon/polish/pooled_1_0_2cand1/minimap2/raw.fna, kmerCon/polish/pooled_2_0_1cand1/minimap2/raw.fna, kmerCon/polish/pooled_1_0_1cand1/minimap2/raw.fna, kmerCon/polish/pooled_0_0_1cand1/minimap2/raw.fna, clust/clusters/pooled_1_0_2.csv, clust/clusters/pooled_2_0_1.csv, clust/clusters/pooled_1_0_1.csv, clust/clusters/pooled_0_0_1.csv
output: kmerCon/clusters
jobid: 5
reason: Missing output files: kmerCon/clusters; Input files updated by another job: kmerCon/split/pooled_1_0_2cand1.fastq, clust/members/pooled_1_0_1.fastq, clust/members/pooled_0_0_1.fastq, kmerCon/polish/pooled_0_0_1cand1/minimap2/raw.fna, kmerCon/split/pooled_1_0_1cand1.fastq, clust/members/pooled_2_0_1.fastq, kmerCon/polish/pooled_1_0_2cand1/minimap2/raw.fna, kmerCon/polish/pooled_2_0_1cand1/minimap2/raw.fna, kmerCon/polish/pooled_1_0_1cand1/minimap2/raw.fna, clust/members/pooled_1_0_2.fastq, kmerCon/split/pooled_0_0_1cand1.fastq, kmerCon/split/pooled_2_0_1cand1.fastq
resources: tmpdir=/tmp
DAG of jobs will be updated after completion.
Config file config.yaml is extended by additional config specified via the command line.
Building DAG of jobs...
Your conda installation is not configured to use strict channel priorities. This is however crucial for having robust and correct environments (for details, see https://conda-forge.org/docs/user/tipsandtricks.html). Please consider to configure strict priorities by executing 'conda config --set channel_priority strict'.
Using shell: /usr/bin/bash
Provided cores: 6
Rules claiming more threads will be scaled down.
Select jobs to execute...
[Thu Feb 1 10:39:24 2024]
Finished job 5.
35 of 50 steps (70%) done
Creating conda environment ../tmp/repo/laca/workflow/envs/racon.yaml...
Downloading and installing remote packages.
Environment for /tmp/repo/laca/workflow/rules/../envs/racon.yaml created (location: tmp_db/conda_envs/d9e9e3d72b467b116a163d017358d209_)
Creating conda environment ../tmp/repo/laca/workflow/envs/medaka.yaml...
Downloading and installing remote packages.
Environment for /tmp/repo/laca/workflow/rules/../envs/medaka.yaml created (location: tmp_db/conda_envs/bde525cfaffd6da24da878a8a66aa8f3_)
Creating conda environment ../tmp/repo/laca/workflow/envs/minimap2.yaml...
Downloading and installing remote packages.
Environment for /tmp/repo/laca/workflow/rules/../envs/minimap2.yaml created (location: tmp_db/conda_envs/51f154299f902025c37a9ddea36e9595_)
Removing temporary output clust/members/pooled_1_0_2.fastq.
Removing temporary output clust/members/pooled_2_0_1.fastq.
Removing temporary output clust/members/pooled_1_0_1.fastq.
Removing temporary output clust/members/pooled_0_0_1.fastq.
Select jobs to execute...
[Thu Feb 1 10:54:03 2024]
rule minimap2polish:
input: kmerCon/polish/pooled_0_0_1cand1/minimap2/raw.fna, kmerCon/split/pooled_0_0_1cand1.fastq
output: kmerCon/polish/pooled_0_0_1cand1/minimap2/raw.paf
log: logs/kmerCon/pooled_0_0_1cand1/minimap2_raw.log
jobid: 95
benchmark: benchmarks/kmerCon/pooled_0_0_1cand1/minimap2_raw.txt
reason: Missing output files: kmerCon/polish/pooled_0_0_1cand1/minimap2/raw.paf
wildcards: consensus=kmerCon, bc_cls_cand=pooled_0_0_1cand1, assembly=raw
threads: 2
resources: tmpdir=/tmp, mem=10, time=1
[Thu Feb 1 10:54:03 2024]
rule minimap2polish:
input: kmerCon/polish/pooled_2_0_1cand1/minimap2/raw.fna, kmerCon/split/pooled_2_0_1cand1.fastq
output: kmerCon/polish/pooled_2_0_1cand1/minimap2/raw.paf
log: logs/kmerCon/pooled_2_0_1cand1/minimap2_raw.log
jobid: 105
benchmark: benchmarks/kmerCon/pooled_2_0_1cand1/minimap2_raw.txt
reason: Missing output files: kmerCon/polish/pooled_2_0_1cand1/minimap2/raw.paf
wildcards: consensus=kmerCon, bc_cls_cand=pooled_2_0_1cand1, assembly=raw
threads: 2
resources: tmpdir=/tmp, mem=10, time=1
Activating conda environment: tmp_db/conda_envs/51f154299f902025c37a9ddea36e9595_
Activating conda environment: tmp_db/conda_envs/51f154299f902025c37a9ddea36e9595_
[Thu Feb 1 10:54:03 2024]
rule minimap2polish:
input: kmerCon/polish/pooled_1_0_1cand1/minimap2/raw.fna, kmerCon/split/pooled_1_0_1cand1.fastq
output: kmerCon/polish/pooled_1_0_1cand1/minimap2/raw.paf
log: logs/kmerCon/pooled_1_0_1cand1/minimap2_raw.log
jobid: 100
benchmark: benchmarks/kmerCon/pooled_1_0_1cand1/minimap2_raw.txt
reason: Missing output files: kmerCon/polish/pooled_1_0_1cand1/minimap2/raw.paf
wildcards: consensus=kmerCon, bc_cls_cand=pooled_1_0_1cand1, assembly=raw
threads: 2
resources: tmpdir=/tmp, mem=10, time=1
Activating conda environment: tmp_db/conda_envs/51f154299f902025c37a9ddea36e9595_
[Thu Feb 1 10:54:04 2024]
Finished job 105.
36 of 70 steps (51%) done
Select jobs to execute...
[Thu Feb 1 10:54:04 2024]
rule racon:
input: kmerCon/split/pooled_2_0_1cand1.fastq, kmerCon/polish/pooled_2_0_1cand1/minimap2/raw.paf, kmerCon/polish/pooled_2_0_1cand1/minimap2/raw.fna
output: kmerCon/polish/pooled_2_0_1cand1/minimap2/racon_1.fna
log: logs/kmerCon/pooled_2_0_1cand1/racon_1.log
jobid: 104
benchmark: benchmarks/kmerCon/pooled_2_0_1cand1/racon_1.txt
reason: Missing output files: kmerCon/polish/pooled_2_0_1cand1/minimap2/racon_1.fna; Input files updated by another job: kmerCon/polish/pooled_2_0_1cand1/minimap2/raw.paf
wildcards: consensus=kmerCon, bc_cls_cand=pooled_2_0_1cand1, iter=1
threads: 2
resources: tmpdir=/tmp, mem=10, time=1
Activating conda environment: tmp_db/conda_envs/d9e9e3d72b467b116a163d017358d209_
[Thu Feb 1 10:54:04 2024]
Finished job 100.
37 of 70 steps (53%) done
Select jobs to execute...
[Thu Feb 1 10:54:04 2024]
rule racon:
input: kmerCon/split/pooled_1_0_1cand1.fastq, kmerCon/polish/pooled_1_0_1cand1/minimap2/raw.paf, kmerCon/polish/pooled_1_0_1cand1/minimap2/raw.fna
output: kmerCon/polish/pooled_1_0_1cand1/minimap2/racon_1.fna
log: logs/kmerCon/pooled_1_0_1cand1/racon_1.log
jobid: 99
benchmark: benchmarks/kmerCon/pooled_1_0_1cand1/racon_1.txt
reason: Missing output files: kmerCon/polish/pooled_1_0_1cand1/minimap2/racon_1.fna; Input files updated by another job: kmerCon/polish/pooled_1_0_1cand1/minimap2/raw.paf
wildcards: consensus=kmerCon, bc_cls_cand=pooled_1_0_1cand1, iter=1
threads: 2
resources: tmpdir=/tmp, mem=10, time=1
Activating conda environment: tmp_db/conda_envs/d9e9e3d72b467b116a163d017358d209_
[Thu Feb 1 10:54:04 2024]
Finished job 95.
38 of 70 steps (54%) done
Select jobs to execute...
[Thu Feb 1 10:54:04 2024]
rule racon:
input: kmerCon/split/pooled_0_0_1cand1.fastq, kmerCon/polish/pooled_0_0_1cand1/minimap2/raw.paf, kmerCon/polish/pooled_0_0_1cand1/minimap2/raw.fna
output: kmerCon/polish/pooled_0_0_1cand1/minimap2/racon_1.fna
log: logs/kmerCon/pooled_0_0_1cand1/racon_1.log
jobid: 94
benchmark: benchmarks/kmerCon/pooled_0_0_1cand1/racon_1.txt
reason: Missing output files: kmerCon/polish/pooled_0_0_1cand1/minimap2/racon_1.fna; Input files updated by another job: kmerCon/polish/pooled_0_0_1cand1/minimap2/raw.paf
wildcards: consensus=kmerCon, bc_cls_cand=pooled_0_0_1cand1, iter=1
threads: 2
resources: tmpdir=/tmp, mem=10, time=1
Activating conda environment: tmp_db/conda_envs/d9e9e3d72b467b116a163d017358d209_
[Thu Feb 1 10:54:04 2024]
Finished job 104.
39 of 70 steps (56%) done
Removing temporary output kmerCon/polish/pooled_2_0_1cand1/minimap2/raw.fna.
Removing temporary output kmerCon/polish/pooled_2_0_1cand1/minimap2/raw.paf.
Select jobs to execute...
[Thu Feb 1 10:54:04 2024]
rule minimap2polish:
input: kmerCon/polish/pooled_2_0_1cand1/minimap2/racon_1.fna, kmerCon/split/pooled_2_0_1cand1.fastq
output: kmerCon/polish/pooled_2_0_1cand1/minimap2/racon_1.paf
log: logs/kmerCon/pooled_2_0_1cand1/minimap2_racon_1.log
jobid: 103
benchmark: benchmarks/kmerCon/pooled_2_0_1cand1/minimap2_racon_1.txt
reason: Missing output files: kmerCon/polish/pooled_2_0_1cand1/minimap2/racon_1.paf; Input files updated by another job: kmerCon/polish/pooled_2_0_1cand1/minimap2/racon_1.fna
wildcards: consensus=kmerCon, bc_cls_cand=pooled_2_0_1cand1, assembly=racon_1
threads: 2
resources: tmpdir=/tmp, mem=10, time=1
Activating conda environment: tmp_db/conda_envs/51f154299f902025c37a9ddea36e9595_
[Thu Feb 1 10:54:04 2024]
Finished job 103.
40 of 70 steps (57%) done
Select jobs to execute...
[Thu Feb 1 10:54:04 2024]
rule racon:
input: kmerCon/split/pooled_2_0_1cand1.fastq, kmerCon/polish/pooled_2_0_1cand1/minimap2/racon_1.paf, kmerCon/polish/pooled_2_0_1cand1/minimap2/racon_1.fna
output: kmerCon/polish/pooled_2_0_1cand1/minimap2/racon_2.fna
log: logs/kmerCon/pooled_2_0_1cand1/racon_2.log
jobid: 102
benchmark: benchmarks/kmerCon/pooled_2_0_1cand1/racon_2.txt
reason: Missing output files: kmerCon/polish/pooled_2_0_1cand1/minimap2/racon_2.fna; Input files updated by another job: kmerCon/polish/pooled_2_0_1cand1/minimap2/racon_1.fna, kmerCon/polish/pooled_2_0_1cand1/minimap2/racon_1.paf
wildcards: consensus=kmerCon, bc_cls_cand=pooled_2_0_1cand1, iter=2
threads: 2
resources: tmpdir=/tmp, mem=10, time=1
Activating conda environment: tmp_db/conda_envs/d9e9e3d72b467b116a163d017358d209_
[Thu Feb 1 10:54:04 2024]
Finished job 99.
41 of 70 steps (59%) done
Removing temporary output kmerCon/polish/pooled_1_0_1cand1/minimap2/raw.fna.
Removing temporary output kmerCon/polish/pooled_1_0_1cand1/minimap2/raw.paf.
Select jobs to execute...
[Thu Feb 1 10:54:04 2024]
rule minimap2polish:
input: kmerCon/polish/pooled_1_0_1cand1/minimap2/racon_1.fna, kmerCon/split/pooled_1_0_1cand1.fastq
output: kmerCon/polish/pooled_1_0_1cand1/minimap2/racon_1.paf
log: logs/kmerCon/pooled_1_0_1cand1/minimap2_racon_1.log
jobid: 98
benchmark: benchmarks/kmerCon/pooled_1_0_1cand1/minimap2_racon_1.txt
reason: Missing output files: kmerCon/polish/pooled_1_0_1cand1/minimap2/racon_1.paf; Input files updated by another job: kmerCon/polish/pooled_1_0_1cand1/minimap2/racon_1.fna
wildcards: consensus=kmerCon, bc_cls_cand=pooled_1_0_1cand1, assembly=racon_1
threads: 2
resources: tmpdir=/tmp, mem=10, time=1
Activating conda environment: tmp_db/conda_envs/51f154299f902025c37a9ddea36e9595_
[Thu Feb 1 10:54:04 2024]
Finished job 102.
42 of 70 steps (60%) done
Removing temporary output kmerCon/polish/pooled_2_0_1cand1/minimap2/racon_1.paf.
Removing temporary output kmerCon/polish/pooled_2_0_1cand1/minimap2/racon_1.fna.
Select jobs to execute...
[Thu Feb 1 10:54:04 2024]
rule medaka_consensus:
input: kmerCon/polish/pooled_2_0_1cand1/minimap2/racon_2.fna, kmerCon/split/pooled_2_0_1cand1.fastq
output: kmerCon/polish/pooled_2_0_1cand1/medaka_1/consensus.fasta, kmerCon/polish/pooled_2_0_1cand1/medaka_1/consensus.fasta.gaps_in_draft_coords.bed, kmerCon/polish/pooled_2_0_1cand1/medaka_1/consensus_probs.hdf, kmerCon/polish/pooled_2_0_1cand1/medaka_1/calls_to_draft.bam, kmerCon/polish/pooled_2_0_1cand1/medaka_1/calls_to_draft.bam.bai
log: logs/kmerCon/pooled_2_0_1cand1/medaka_1.log
jobid: 101
benchmark: benchmarks/kmerCon/pooled_2_0_1cand1/medaka_1.txt
reason: Missing output files: kmerCon/polish/pooled_2_0_1cand1/medaka_1/consensus.fasta; Input files updated by another job: kmerCon/polish/pooled_2_0_1cand1/minimap2/racon_2.fna
wildcards: consensus=kmerCon, bc_cls_cand=pooled_2_0_1cand1, iter2=1
threads: 2
resources: tmpdir=/tmp, mem=10, time=1
Activating conda environment: tmp_db/conda_envs/bde525cfaffd6da24da878a8a66aa8f3_
[Thu Feb 1 10:54:04 2024]
Finished job 98.
43 of 70 steps (61%) done
Select jobs to execute...
[Thu Feb 1 10:54:04 2024]
rule racon:
input: kmerCon/split/pooled_1_0_1cand1.fastq, kmerCon/polish/pooled_1_0_1cand1/minimap2/racon_1.paf, kmerCon/polish/pooled_1_0_1cand1/minimap2/racon_1.fna
output: kmerCon/polish/pooled_1_0_1cand1/minimap2/racon_2.fna
log: logs/kmerCon/pooled_1_0_1cand1/racon_2.log
jobid: 97
benchmark: benchmarks/kmerCon/pooled_1_0_1cand1/racon_2.txt
reason: Missing output files: kmerCon/polish/pooled_1_0_1cand1/minimap2/racon_2.fna; Input files updated by another job: kmerCon/polish/pooled_1_0_1cand1/minimap2/racon_1.fna, kmerCon/polish/pooled_1_0_1cand1/minimap2/racon_1.paf
wildcards: consensus=kmerCon, bc_cls_cand=pooled_1_0_1cand1, iter=2
threads: 2
resources: tmpdir=/tmp, mem=10, time=1
Activating conda environment: tmp_db/conda_envs/d9e9e3d72b467b116a163d017358d209_
/usr/bin/bash: line 7: LD_LIBRARY_PATH: unbound variable
[Thu Feb 1 10:54:04 2024]
Error in rule medaka_consensus:
jobid: 101
input: kmerCon/polish/pooled_2_0_1cand1/minimap2/racon_2.fna, kmerCon/split/pooled_2_0_1cand1.fastq
output: kmerCon/polish/pooled_2_0_1cand1/medaka_1/consensus.fasta, kmerCon/polish/pooled_2_0_1cand1/medaka_1/consensus.fasta.gaps_in_draft_coords.bed, kmerCon/polish/pooled_2_0_1cand1/medaka_1/consensus_probs.hdf, kmerCon/polish/pooled_2_0_1cand1/medaka_1/calls_to_draft.bam, kmerCon/polish/pooled_2_0_1cand1/medaka_1/calls_to_draft.bam.bai
log: logs/kmerCon/pooled_2_0_1cand1/medaka_1.log (check log file(s) for error details)
conda-env: /home/tmp_db/conda_envs/bde525cfaffd6da24da878a8a66aa8f3_
shell:
# if fna file is empty, make dummy output
if [ ! -s kmerCon/polish/pooled_2_0_1cand1/minimap2/racon_2.fna ]; then
mkdir -p kmerCon/polish/pooled_2_0_1cand1/medaka_1 2> logs/kmerCon/pooled_2_0_1cand1/medaka_1.log
touch kmerCon/polish/pooled_2_0_1cand1/medaka_1/consensus.fasta kmerCon/polish/pooled_2_0_1cand1/medaka_1/consensus.fasta.gaps_in_draft_coords.bed kmerCon/polish/pooled_2_0_1cand1/medaka_1/consensus_probs.hdf kmerCon/polish/pooled_2_0_1cand1/medaka_1/calls_to_draft.bam kmerCon/polish/pooled_2_0_1cand1/medaka_1/calls_to_draft.bam.bai 2>> logs/kmerCon/pooled_2_0_1cand1/medaka_1.log
else
export OLD_LD_LIBRARY_PATH=${LD_LIBRARY_PATH}
export LD_LIBRARY_PATH="$CONDA_PREFIX/lib":${LD_LIBRARY_PATH}
export TF_CPP_MIN_LOG_LEVEL='2'
export CUDA_VISIBLE_DEVICES=""
medaka_consensus -i kmerCon/split/pooled_2_0_1cand1.fastq -d kmerCon/polish/pooled_2_0_1cand1/minimap2/racon_2.fna -o kmerCon/polish/pooled_2_0_1cand1/medaka_1 -t 2 -m r941_min_hac_g507 > logs/kmerCon/pooled_2_0_1cand1/medaka_1.log 2>&1
rm -f kmerCon/polish/pooled_2_0_1cand1/minimap2/racon_2.fna.fai kmerCon/polish/pooled_2_0_1cand1/minimap2/racon_2.fna.map-ont.mmi
export LD_LIBRARY_PATH=${OLD_LD_LIBRARY_PATH}
unset OLD_LD_LIBRARY_PATH
fi
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
[Thu Feb 1 10:54:05 2024]
Finished job 94.
44 of 70 steps (63%) done
Removing temporary output kmerCon/polish/pooled_0_0_1cand1/minimap2/raw.fna.
Removing temporary output kmerCon/polish/pooled_0_0_1cand1/minimap2/raw.paf.
[Thu Feb 1 10:54:05 2024]
Finished job 97.
45 of 70 steps (64%) done
Removing temporary output kmerCon/polish/pooled_1_0_1cand1/minimap2/racon_1.paf.
Removing temporary output kmerCon/polish/pooled_1_0_1cand1/minimap2/racon_1.fna.
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-02-01T103338.686815.snakemake.log
2024-02-01 10:54:05,378 - root - CRITICAL - Command 'snakemake all --directory '/home' --snakefile '/tmp/repo/laca/workflow/Snakefile' --configfile '/home/config.yaml' --use-conda --conda-prefix '/home/tmp_db/conda_envs' --use-singularity --singularity-prefix '/home/tmp_db/singularity_envs' --singularity-args '--bind /tmp/repo/laca/workflow/resources/guppy_barcoding/:/opt/ont/guppy/data/barcoding/,/home/tmp_fastq' --rerun-triggers mtime --rerun-incomplete --scheduler greedy --jobs 6 --nolock --resources mem=957 mem_mb=980158 java_mem=813 ' returned non-zero exit status 1. (laca.py:113)
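Because Snakemake interleaves the output of parallel jobs, the failing rule is easy to miss in a log like the one above. Here is a rough sketch for pulling the failed rule names and their per-rule log files out of a Snakemake log. `failed_rules` is a hypothetical helper (not part of laca or Snakemake), and it assumes the "Error in rule <name>:" / "log: <path>" layout seen in this log.

```shell
# Hypothetical helper: list "<rule><TAB><log file>" for every failed rule
# reported in a Snakemake log. Assumes the "Error in rule ...:" block layout
# shown in the log above.
failed_rules() {
    local log="$1"
    awk '
        # Remember the rule name from "Error in rule <name>:".
        /^Error in rule / {
            rule = $4
            sub(/:$/, "", rule)
            next
        }
        # The first "log:" line after that belongs to the failed rule.
        rule != "" && /^[[:space:]]*log:/ {
            sub(/^[[:space:]]*log:[[:space:]]*/, "")
            sub(/ \(check log file.*$/, "")
            print rule "\t" $0
            rule = ""
        }
    ' "$log"
}
```

For the run above, this would be called as `failed_rules .snakemake/log/2024-02-01T103338.686815.snakemake.log`.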
It's almost there. It's probably related to the version of medaka.
Could you please share the medaka log file (logs/kmerCon/pooled_2_0_1cand1/medaka_1.log)?
The file logs/kmerCon/pooled_2_0_1cand1/medaka_1.log is empty.
I tried to log into the docker container and check the medaka version, but the command is not found:
alessiomilanese:tmp$ docker run -it -v `pwd`:/home --privileged yanhui09/laca
(base) root@09c5be3d3b8e:/home# medaka
bash: medaka: command not found
(base) root@09c5be3d3b8e:/home# medaka_consensus
bash: medaka_consensus: command not found
But maybe it is in an environment?
Yes, it's installed in a conda environment, i.e., tmp_db/conda_envs/d9e9e3d72b467b116a163d017358d209_
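Snakemake keeps each rule's environment in a hashed directory under the conda prefix, so the tools are not on the default PATH inside the container. Below is a minimal sketch for finding which environment actually ships a given executable. `find_env_with` is a hypothetical helper, not part of laca. Note that the log above shows medaka_consensus activating the environment ending in bde525cfaffd6da24da878a8a66aa8f3_, so the hash may differ from the one quoted here.

```shell
# Hypothetical helper: print every environment directory under a conda prefix
# that has the given executable in its bin/ folder.
find_env_with() {
    local prefix="$1" tool="$2" env
    for env in "$prefix"/*/; do
        # -x is true only if the file exists and is executable.
        if [ -x "${env}bin/${tool}" ]; then
            printf '%s\n' "${env%/}"
        fi
    done
}
```

With the paths from this thread, the call would be `find_env_with tmp_db/conda_envs medaka_consensus`. Once the directory is known, something like `conda run -p <env_dir> medaka --version` should report the installed version, assuming conda is available in the container.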
I found an error in the log.
/usr/bin/bash: line 7: LD_LIBRARY_PATH: unbound variable
I just tried re-installing/re-loading medaka for laca in the base environment, and it works fine. I will give it another try with docker.
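For context, the message comes from bash strict mode, which the Snakemake log mentions: with `set -u`, expanding ${LD_LIBRARY_PATH} fails when the variable was never set, which appears to be the case inside the container. The sketch below shows one common guard, a default expansion. It is only an illustration and not necessarily how laca handles it; `prepend_ld_path` and the example path are hypothetical.

```shell
#!/usr/bin/env bash
# Strict mode: -u turns any expansion of an unset variable into an error.
set -euo pipefail

# Hypothetical helper: prepend a directory to LD_LIBRARY_PATH without
# tripping `set -u` when the variable is unset.
prepend_ld_path() {
    local dir="$1"
    # ${VAR:-} expands to an empty string instead of raising "unbound variable".
    local old="${LD_LIBRARY_PATH:-}"
    if [ -n "$old" ]; then
        export LD_LIBRARY_PATH="$dir:$old"
    else
        export LD_LIBRARY_PATH="$dir"
    fi
}

unset LD_LIBRARY_PATH                 # simulate the container, where it is not set
prepend_ld_path "/opt/example/lib"    # placeholder path, for illustration only
echo "$LD_LIBRARY_PATH"
```

Writing `${LD_LIBRARY_PATH}` directly in place of `${LD_LIBRARY_PATH:-}` would reproduce the "unbound variable" error from the log.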
Hi
I think it's related to the latest version of medaka when used in docker.
Fortunately, the LD_LIBRARY_PATH is only needed for cuda use in medaka.
Since medaka-cpu is usually fast enough to generate a consensus from a read cluster, I have suspended the cuda use of medaka in the latest version of laca, and it works on my server.
You can pull the latest docker image and give it a try.
Thanks for updating the docker image and for the prompt response and support. I re-ran it, and now it completed with no errors!
Here's what I have in the repository:
drwxr-xr-x 10 root root 4.0K Feb 2 10:01 benchmarks
drwxr-xr-x 5 root root 4.0K Feb 2 10:19 clust
-rw-r--r-- 1 root root 8.0K Feb 2 09:51 config.yaml
-rw-r--r-- 1 root root 52 Feb 2 10:02 count_matrix.tsv
drwxr-xr-x 5 root root 4.0K Feb 2 09:57 demultiplexed
drwxr-xr-x 3 root root 4.0K Feb 2 10:19 kmerCon
drwxr-xr-x 9 root root 4.0K Feb 2 10:01 logs
drwxr-xr-x 3 root root 4.0K Feb 2 10:19 qc
-rw-r--r-- 1 root root 0 Feb 2 09:57 .qc_DONE
drwxr-xr-x 2 root root 4.0K Feb 2 10:02 quant
-rw-r--r-- 1 root root 4.2K Feb 2 10:01 rep_seqs.fasta
drwxr-xr-x 9 root root 4.0K Feb 1 11:33 .snakemake
drwxr-xr-x 3 root root 4.0K Feb 2 10:01 taxonomy
-rw-r--r-- 1 root root 525 Feb 2 10:19 taxonomy.tsv
drwxr-xr-x 5 root root 4.0K Feb 2 10:02 tmp_db
drwxr-xr-x 2 alessiomilanese docker 4.0K Feb 2 09:49 tmp_fastq
drwxr-xr-x 3 root root 4.0K Feb 2 10:19 tree
-rw-r--r-- 1 root root 118 Feb 2 10:19 tree.nwk
Just a few more questions:
1) Is it correct to get 4 OTUs as the result from your test data?
2) I assume that if I have more fastq files in my input folder (-b), then I will have more columns in the result file count_matrix.tsv?
3) In the count matrix, the sample is named BRK13. Where was this specified? And how can I specify the names when I have multiple fastq files as input?
4) Where are the main results?
Checking the files I assume:
count_matrix.tsv Contains the read counts for each OTU for the only sample we had
#OTU ID BRK13
OTU_1 29
OTU_2 790
OTU_3 298
OTU_4 74
taxonomy.tsv Contains the taxonomy annotation for each OTU
OTU_1 k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas;s__Pseudomonas_aeruginosa
OTU_2 k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Listeriaceae;g__Listeria;s__Listeria_monocytogenes
OTU_3 k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Salmonella;s__Salmonella_enterica
OTU_4 k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Lactobacillaceae;g__Limosilactobacillus;s__Lactobacillus_fermentum
rep_seqs.fasta Contains the actual sequences of the OTUs
>OTU_1
ATATTGGACAATGGGCGAAAGCCTGATCCAGCCATGCCGCGTGTGTGAAGAAGGTCTTCGGATTGTAAAGCACTTTAAGTTGGGAGGAAGGGCAGTAAGTTAATACCTTGCTGTTTTGACGTTACCAACAGAATAAGCACCGGCTAACTTCGTGCCAGCAGCCGCGGTAATACGAAGGGTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCGCGCGTAGGTGGTTCAGCAAGTTGGATGTGAAATCCCCGGGCTCAACCTGGGAACTGCATCCAAAACTACTGAGCTAGAGTACGGTAGAGGGTGGTGGAATTTCCTGTGTAGCGGTGAAATGCGTAGATATAGGAAGGAACACCAGTGGCGAAGGCGACCACCTGGACTGATACTGACACTGAGGTGCGAAAGCGTGGGGAGCAAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGATGTCGACTAGCCGTTGGGATCCTTGAGATCTTAGTGGCGCAGCTAACGCGATAAGTCGACCGCCTGGGGAGTACGGCCGCAAGGTTAAAACTCAAATGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTAATTCGAAGCAACGCGAAGAACCTTACCTGGCCTTGACATGCTGAGAACTTTCCAGAGATGGATTGGTGCCTTCGGGAACTCAGACACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGTAACGAGCGCAACCCTTGTCCTTAGTTACCAGCACCTCGGGTGGGCACTCTAAGGAGACTGCCGGTGACAAACCGGAGGAAGGTGGGGATGACGTCAAGTCATCATGGCCCTTACGGCCAGGGCTACACACGTGCTACAATGGTCGGTACAAAGGGTTGCCAAGCCGCGAGGTGGAGCTAATCCCATAAAACCGATCGTAGTCCGGATCGCAGTCTGCAACTCGACTGCGTGAAGTCGGAATCGCTAGTAATCGTGAATCAGAATGTCACGGTGAATACGTTCCCGGGCCTTGTACACACCGCCCGTC
>OTU_2
GAGGCAGCAGTAGGGAATCTTCCGCAATGGACGAAAGTCTGACGGAGCAACGCCGCGTGTAGTGAAGAAGGTTTTCGGATCGTAAAGCTCTGTTGTTAGAGAAGAACAAGGATAAGAGTAACTGCTTGTCCCTTGACGGTATCTAACCAGAAAGCCACGGCTAACTACGTGCCAGCAGCCGCGGTAATACGTAGGTGGCAAGCGTTGTCCGGATTTATTGGGCGTAAAGCGCGCGCAGGCGGTTTTTAAGTCTGATGTGAAAGCCCCCGGCTCAACCGGGGAGGGTCATTGGAAACTGGAAGACTTGAGTGCAGAAGAGGAGAGTGGAATTCCACGTGTAGCGGTGAAATGCGTAGATATGTGGAGGAACACCAGTGGCGAAGGCGACTCTCTGGTCTGTAACTGACGCTGAGGCGCGAAAGCGTGGGGAGCAAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGATGAGTGCTAAGTGTTAGGGGGTTTCCGCCCCTTAGTGCTGCAGCTAACGCATTAAGCACTCCGCCTGGGGAGTACGACCGCAAGGTTGAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTAATTCGAAGCAACGCGAAGAACCTTACCAGGTCTTGACATCCTTTGACCACTCTAGAGATAGAGCTTTCCCTTCGGGGACAAAGTGACAGGTGGTGCATGGTTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTGATTTTAGTTGCCAGCATTTAGTTGGGCACTCTAAGGTGACTGCCGGTGACAAACCGGAGGAAGGTGGGGATGACGTCAAATCATCATGCCCCTTATGACCTGGGCTACACACGTGCTACAATGGATAGTACAAAGGGTCGCGAAACCGCGAGGTGAAGCTAATCCCATAAAACTGTTCTCAGTTCGGATTGTAGGCTGCAACTCGCCTACATGAAGCCGGAATCGCTAGTAATCGTGGATCAGCATGCCACGGTGAATACGTTCCCGGGCCTTGTACACACCGCCCGTC
>OTU_3
AGTGGGGAATATTGCACAATGGGCGCAAGCCTGATGCAGCCATGCCGCGTGTATGAAGAAGGCCTTCGGGTTGTAAAGTACTTTCAGCGGGGAGGAAGGTGTTGTGGTTAATAACCGCTGCTCATTGACGTTACCCGCAGAAGAAGCACCGGCTAACTCCGTGCCAGCAGCCGCGGTAATACGGAGGGTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCGCACGCAGGCGGTTTGTCAAGTCGGATGTGAAATCCCCGGGCTCAACCTGGGAACTGCATCTGATACTGGCAGGCTTGAGTCTTGTAGAGGGGGGTAGAATTCCAGGTGTAGCGGTGAAATGCGTAGAGATCTGGAGGAATACCGGTGGCGAAGGCGGCCCCCTGGACAAAGACTGACGCTCAGGTGCGAAAGCGTGGGGAGCAAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGATGTCTACTTGGAGGTTGTGCCCTTGAGGCGTGGCTTCCGGAGCTAACGCGTTAAGTCGACCGCCTGGGGAGTACGGCCGCAAGGTTAAAACTCAAATGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTAATTCGATGCAACGCGAAGAACCTTACCTGGTCTTGACATCCACAGAAGTTTCCAGAGATGAGATTGGTGCCTTCGGGAACTGTGAGACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTTGTGAAATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTATCCTTTGTTGCCAGCGGTTCGGCCGGGAACTCAAAGGAGACTGCCAGTGATAAACTGGAGGAAGGTGGGGATGACGTCAAGTCATCATGGCCCTTACGACCAGGGCTACACACGTGCTACAATGGCGCATACAAAGAGAAGCGACCTCGCGAGAGCAAGCGGACCTCATAAAGTGCGTCGTAGTCCGGATTGGAGTCTGCAACTCGACTCCATGAAGTCGGAATCGCTAGTAATCGTGGATCAGAATGCCACGGTGAATACG
>OTU_4
AATCTTCCACAATGGGCGCAAGCCTGATGGAGCAACACCGCGTGAGTGAAGAAGGGTTTCGGCTCGTAAAGCTCTGTTGTTAAAGAAGAACACGTATGAGAGTAACTGTTCATACGTTGACGGTATTTAACCAGAAAGTCACGGCTAACTACGTGCCAGCAGCCGCGGTAATACGTAGGTGGCAAGCGTTATCCGGATTTATTGGGCGTAAAGAGAGTGCAGGCGGTTTTCTAAGTCTGATGTGAAAGCCTTCGGCTTAACCGGAGAAGTGCATCGGAAACTGGATAACTTGAGTGCAGAAGAGGGTAGTGGAACTCCATGTGTAGCGGTGGAATGCGTAGATATATGGAAGAACACCAGTGGCGAAGGCGGCTACCTGGTCTGCAACTGACGCTGAGACTCGAAAGCATGGGTAGCGAACAGGATTAGATACCCTGGTAGTCCATGCCGTAAACGATGAGTGCTAGGTGTTGGAGGGTTTCCGCCCTTCAGTGCCGGAGCTAACGCATTAAGCACTCCGCCTGGGGAGTACGACCGCAAGGTTGAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTAATTCGAAGCTACGCGAAGAACCTTACCAGGTCTTGACATCTTGCGCCAACCCTAGAGATAGGGCGTTTCCTTCGGGAACGCAATGACAGGTGGTGCATGGTCGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTGTTACTAGTTGCCAGCATTAAGTTGGGCACTCTAGTGAGACTGCCGGTGACAAACCGGAGGAAGGTGGGGACGACGTCAGATCATCATGCCCCTTATGACCTGGGCTACACACGTGCTACAATGGACGGTACAACGAGTCGCGAACTCGCGAGGGCAAGCAAATCTCTTAAAACCGTTCTCAGTTCGGACTGCAGGCTGCAACTCGCCTGCACGAAGTCGGAATCGCTAGTAATCGCGGATCAGCATGCCGCGGTGAATACGTTCCCGGGCCTTGTACACACCGCCCGTC
tree.nwk Contains the phylogenetic tree
(('OTU_1':0.055790368,'OTU_3':0.09398297)1.000:0.04706135,('OTU_2':0.035200803,'OTU_4':0.066716778):0.074327542)root;
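As a quick way to read the first two files together, here is a small sketch that joins the counts with the last taxonomy rank for each OTU. It assumes both files are tab-separated, that taxonomy.tsv has the OTU ID in column 1 and the lineage in column 2, and that only the first sample column is of interest. `summarize_otus` is a hypothetical helper, not something laca provides.

```shell
# Hypothetical helper: print "<OTU><TAB><count><TAB><last taxonomy rank>".
# Assumes tab-separated input and reads only the first sample column.
summarize_otus() {
    local counts="$1" taxonomy="$2"
    awk -F'\t' '
        # First file (taxonomy): remember the lineage for each OTU ID.
        NR == FNR { tax[$1] = $2; next }
        # Second file (counts): skip the "#OTU ID" header line.
        /^#/ { next }
        {
            # Keep only the last rank of the semicolon-separated lineage.
            n = split(tax[$1], ranks, ";")
            print $1 "\t" $2 "\t" ranks[n]
        }
    ' "$taxonomy" "$counts"
}
```

It would be called as `summarize_otus count_matrix.tsv taxonomy.tsv` from the result directory.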
fastq input. The column names are inferred from the demultiplexed directory in the result directory.
(base) [yanhui@yan01 laca_docker]$ ls demultiplexed
BRK13 suspected unclassified barcoding_summary.txt read_processor_log-2024-02-06_07-40-43.log
guppy or minibar. If you use the ku barcodes, you can use the default barcodes. The barcodes are inferred in the demultiplexing process on the fastq input.
count_matrix.tsv, taxonomy.tsv, rep_seqs.fasta and tree.nwk are the most commonly used files for downstream analysis.
Nice, thanks!
Hello,
Thanks for providing this tool.
I was wondering if you could provide a test dataset with a few fastq files and a database file. I'm having some problems running the tool using the docker container, and I wanted to know whether it's an issue with my input or I'm doing something wrong.
In case you don't have an FTP server, you could use https://zenodo.org/ to store the files.