caleblareau / mgatk

mgatk: mitochondrial genome analysis toolkit
http://caleblareau.github.io/mgatk
MIT License

Error in rule make_depth_table: 'InputFiles' object has no attribute 'depths' #22

Closed heruiyang closed 4 years ago

heruiyang commented 4 years ago

Hi,

I'm trying to run the example for mgatk call using tests/humanbam, and I'm getting an error in rule make_depth_table that 'InputFiles' object has no attribute 'depths'. Here's the full output:

(mgatk) rh476@CFCE2:~$ mgatk call -i humanbam -o outdir -c 8 -g hg19 -n test -kd
Wed Aug 19 17:44:27 EDT 2020: mgatk v0.5.8
Wed Aug 19 17:44:27 EDT 2020: Found designated mitochondrial chromosome: chrM
Wed Aug 19 17:44:27 EDT 2020: Genotyping samples with 8 threads
Building DAG of jobs...
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 8
Rules claiming more threads will be scaled down.
Job counts:
    count   jobs
    1   all
    1   make_sample_list
    1   process_one_sample
    3

[Wed Aug 19 17:44:28 2020]
rule process_one_sample:
    input: outdir/.internal/samples/MGH60-P6-A11.mito.bam.txt
    output: outdir/temp/ready_bam/MGH60-P6-A11.mito.qc.bam, outdir/temp/ready_bam/MGH60-P6-A11.mito.qc.bam.bai, outdir/qc/depth/MGH60-P6-A11.mito.depth.txt, outdir/temp/sparse_matrices/MGH60-P6-A11.mito.A.txt, outdir/temp/sparse_matrices/MGH60-P6-A11.mito.C.txt, outdir/temp/sparse_matrices/MGH60-P6-A11.mito.G.txt, outdir/temp/sparse_matrices/MGH60-P6-A11.mito.T.txt, outdir/temp/sparse_matrices/MGH60-P6-A11.mito.coverage.txt
    jobid: 2
    wildcards: sample=MGH60-P6-A11.mito

Using shell: /bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job counts:
    count   jobs
    1   all
    1   make_depth_table
    1   make_final_sparse_matrices
    3

[Wed Aug 19 17:44:28 2020]
rule make_depth_table:
    output: outdir/final/test.depthTable.txt
    jobid: 1

Job counts:
    count   jobs
    1   process_one_sample
    1
Job counts:
    count   jobs
    1   make_depth_table
    1
[Wed Aug 19 17:44:29 2020]
Error in rule make_depth_table:
    jobid: 0
    output: outdir/final/test.depthTable.txt

RuleException:
AttributeError in line 30 of /mnt/cfce-stor1/home/rh476/miniconda3/envs/mgatk/lib/python3.8/site-packages/mgatk/bin/snake/Snakefile.Gather:
'InputFiles' object has no attribute 'depths'
  File "/mnt/cfce-stor1/home/rh476/miniconda3/envs/mgatk/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 2168, in run_wrapper
  File "/mnt/cfce-stor1/home/rh476/miniconda3/envs/mgatk/lib/python3.8/site-packages/mgatk/bin/snake/Snakefile.Gather", line 30, in __rule_make_depth_table
  File "/mnt/cfce-stor1/home/rh476/miniconda3/envs/mgatk/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 529, in _callback
  File "/mnt/cfce-stor1/home/rh476/miniconda3/envs/mgatk/lib/python3.8/concurrent/futures/thread.py", line 57, in run
  File "/mnt/cfce-stor1/home/rh476/miniconda3/envs/mgatk/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 515, in cached_or_run
  File "/mnt/cfce-stor1/home/rh476/miniconda3/envs/mgatk/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 2199, in run_wrapper
Exiting because a job execution failed. Look above for error message
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /mnt/cfce-stor1/home/rh476/.snakemake/log/2020-08-19T174428.325464.snakemake.log
[Wed Aug 19 17:44:35 2020]
Finished job 2.
1 of 3 steps (33%) done

[Wed Aug 19 17:44:35 2020]
rule make_sample_list:
    input: outdir/qc/depth/MGH60-P6-A11.mito.depth.txt
    output: outdir/temp/scattered.allSamples.txt
    jobid: 1

Job counts:
    count   jobs
    1   make_sample_list
    1
[Wed Aug 19 17:44:36 2020]
Finished job 1.
2 of 3 steps (67%) done

[Wed Aug 19 17:44:36 2020]
localrule all:
    input: outdir/temp/scattered.allSamples.txt
    jobid: 0

[Wed Aug 19 17:44:36 2020]
Finished job 0.
3 of 3 steps (100%) done
Complete log: /mnt/cfce-stor1/home/rh476/.snakemake/log/2020-08-19T174428.332329.snakemake.log
Error in checkGrep(grep(".A.txt", files)) : 
  Improper folder specification; file missing / extra file present. See documentation
Calls: importMito -> checkGrep
Execution halted

Would you know what might be causing this?

Thanks, Ruiyang

caleblareau commented 4 years ago

Can you send an ls -lRh of the output directory?

heruiyang commented 4 years ago

Sure, here's the output:

(mgatk) rh476@CFCE2:~$ ls -lRh outdir
outdir:
total 4.0K
drwxrwxr-x 2 rh476 rh476 3 Aug 19 20:01 final
drwxrwxr-x 3 rh476 rh476 8 Aug 19 20:01 logs
drwxrwxr-x 4 rh476 rh476 4 Aug 19 20:01 qc

outdir/final:
total 75K
-rw-rw-r-- 1 rh476 rh476 119K Aug 19 20:01 chrM_refAllele.txt

outdir/logs:
total 7.5K
-rw-rw-r-- 1 rh476 rh476  406 Aug 19 20:01 base.mgatk.log
drwxrwxr-x 2 rh476 rh476    3 Aug 19 20:01 filterlogs
-rw-rw-r-- 1 rh476 rh476  434 Aug 19 20:01 test.parameters.txt
-rw-rw-r-- 1 rh476 rh476    0 Aug 19 20:01 test.snakemake_gather.log
-rw-rw-r-- 1 rh476 rh476    0 Aug 19 20:01 test.snakemake_scatter.log
-rw-rw-r-- 1 rh476 rh476 3.7K Aug 19 20:01 test.snakemake_scatter.stats

outdir/logs/filterlogs:
total 512
-rw-rw-r-- 1 rh476 rh476 24 Aug 19 20:01 MGH60-P6-A11.mito.filter.log

outdir/qc:
total 1.0K
drwxrwxr-x 2 rh476 rh476 3 Aug 19 20:01 depth
drwxrwxr-x 2 rh476 rh476 2 Aug 19 20:01 quality

outdir/qc/depth:
total 512
-rw-rw-r-- 1 rh476 rh476 25 Aug 19 20:01 MGH60-P6-A11.mito.depth.txt

outdir/qc/quality:
total 0

heruiyang commented 4 years ago

It seems like the issue is due to Snakefile.gather being run before Snakefile.scatter has created the required output files. Here's the log for Snakefile.scatter:

(mgatk) rh476@CFCE2:~$ cat /mnt/cfce-stor1/home/rh476/.snakemake/log/2020-08-21T095746.065846.snakemake.log
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 64
Rules claiming more threads will be scaled down.
Job counts:
    count   jobs
    1   all
    1   make_sample_list
    1   process_one_sample
    3

[Fri Aug 21 09:57:46 2020]
rule process_one_sample:
    input: outdir/.internal/samples/MGH60-P6-A11.mito.bam.txt
    output: outdir/temp/ready_bam/MGH60-P6-A11.mito.qc.bam, outdir/temp/ready_bam/MGH60-P6-A11.mito.qc.bam.bai, outdir/qc/depth/MGH60-P6-A11.mito.depth.txt, outdir/temp/sparse_matrices/MGH60-P6-A11.mito.A.txt, outdir/temp/sparse_matrices/MGH60-P6-A11.mito.C.txt, outdir/temp/sparse_matrices/MGH60-P6-A11.mito.G.txt, outdir/temp/sparse_matrices/MGH60-P6-A11.mito.T.txt, outdir/temp/sparse_matrices/MGH60-P6-A11.mito.coverage.txt
    jobid: 2
    wildcards: sample=MGH60-P6-A11.mito

[Fri Aug 21 09:57:56 2020]
Finished job 2.
1 of 3 steps (33%) done

[Fri Aug 21 09:57:56 2020]
rule make_sample_list:
    input: outdir/qc/depth/MGH60-P6-A11.mito.depth.txt
    output: outdir/temp/scattered.allSamples.txt
    jobid: 1

[Fri Aug 21 09:57:56 2020]
Finished job 1.
2 of 3 steps (67%) done

[Fri Aug 21 09:57:56 2020]
localrule all:
    input: outdir/temp/scattered.allSamples.txt
    jobid: 0

[Fri Aug 21 09:57:56 2020]
Finished job 0.
3 of 3 steps (100%) done
Complete log: /mnt/cfce-stor1/home/rh476/.snakemake/log/2020-08-21T095746.065846.snakemake.log

And for Snakefile.gather:

(mgatk) rh476@CFCE2:~$ cat /mnt/cfce-stor1/home/rh476/.snakemake/log/2020-08-21T095746.065921.snakemake.log
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job counts:
    count   jobs
    1   all
    1   make_depth_table
    1   make_final_sparse_matrices
    3

[Fri Aug 21 09:57:46 2020]
rule make_depth_table:
    output: outdir/final/test.depthTable.txt
    jobid: 1

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /mnt/cfce-stor1/home/rh476/.snakemake/log/2020-08-21T095746.065921.snakemake.log

Rule make_depth_table is running before process_one_sample has created the required output files.

heruiyang commented 4 years ago

I've managed to solve the issue. It turns out that there's something wrong with os.system() on my Python installation: redirecting output with &> doesn't work and, more importantly, makes the os.system() call non-blocking, which is what causes the problem. In any case, using the --snake-stdout flag resolves it.
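
A possible explanation, offered as an assumption rather than something confirmed in this thread: os.system() hands the command string to /bin/sh, and `&>` is a bash extension. In a POSIX shell such as dash, `cmd &> log` is read as `cmd &` followed by `> log`, so the command is sent to the background and the call returns immediately. The sketch below shows one way to avoid depending on shell redirection at all. The helper name `run_and_log` and the stand-in command are hypothetical and not part of mgatk.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_log(cmd, log_path):
    """Run cmd, wait for it to finish, and write stdout+stderr to log_path."""
    with open(log_path, "w") as log:
        # Python performs the redirection itself, so no shell syntax such as
        # `&>` is involved, and subprocess.run() blocks until the child exits.
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log_file = Path(tmp) / "demo.log"
        # Stand-in command; a real caller would pass its own argument list.
        code = run_and_log([sys.executable, "-c", "print('done')"], log_file)
        print(code, log_file.read_text().strip())
```

Because the command is passed as an argument list and the log file is opened by Python, the behaviour does not change with whichever shell /bin/sh points to, and the return code is available to decide whether a later step should start.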

caleblareau commented 4 years ago

Ah okay. I’ve seen something like this in other contexts. Thanks for reporting and figuring this out. Since you’ve dug into it a bit, do you have a sense of how I could better report this as an error?

heruiyang commented 4 years ago

I think perhaps a message informing the user that snakemake has failed to generate required intermediate files and asking the user to check the snakemake logs could be useful.

Thanks, Ruiyang
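
The suggestion above could be sketched roughly as follows. This is only an illustration: the function names are hypothetical, and the directory layout and file suffixes are taken from the process_one_sample outputs shown in the logs earlier in this thread, not from mgatk's source.

```python
from pathlib import Path

# Suffixes as they appear in the process_one_sample output list in the logs.
EXPECTED_SUFFIXES = ("A.txt", "C.txt", "G.txt", "T.txt", "coverage.txt")


def missing_intermediates(outdir, samples):
    """Return the expected sparse-matrix files that do not exist."""
    matrix_dir = Path(outdir) / "temp" / "sparse_matrices"
    return [
        matrix_dir / f"{sample}.{suffix}"
        for sample in samples
        for suffix in EXPECTED_SUFFIXES
        if not (matrix_dir / f"{sample}.{suffix}").is_file()
    ]


def check_scatter_outputs(outdir, samples, log_path):
    """Raise a readable error if the first step left files missing."""
    missing = missing_intermediates(outdir, samples)
    if missing:
        listing = "\n  ".join(str(p) for p in missing)
        raise RuntimeError(
            "snakemake did not generate the required intermediate files:\n  "
            f"{listing}\nCheck the snakemake log for details: {log_path}"
        )
```

Calling something like `check_scatter_outputs("outdir", ["MGH60-P6-A11.mito"], "outdir/logs/test.snakemake_scatter.log")` before the gather step would stop the run with a message that names the missing files and points at the log, instead of the AttributeError on `depths`.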