zavolanlab / mirflowz

Snakemake workflow for the mapping and quantification of miRNAs and isomiRs from miRNA-Seq libraries.
MIT License

Genome resource preparation fails if no isomiRs are to be created/analyzed #3

Closed: deliaBlue closed this issue 1 year ago

deliaBlue commented 1 year ago

MIRFLOWZ-prepare execution fails for the isomiR annotation processing steps when bp_5p and bp_3p are set to [0]:

Error in rule iso_anno_final:
    jobid: 15
    output: results/homo_sapiens/GRCh38.106/isomirs_annotation.bed
    log: logs/local/homo_sapiens/GRCh38.106/iso_anno_final.log (check log file(s) for error message)
    shell:
        (grep -v '5p0_3p0' results/homo_sapiens/GRCh38.106/iso_anno_concat.bed > results/homo_sapiens/GRCh38.106/isomirs_annotation.bed) &> logs/local/homo_sapiens/GRCh38.106/iso_anno_final.log
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
    cluster_jobid: Submitted batch job 62412752

Error executing rule iso_anno_final on cluster (jobid: 15, external: Submitted batch job 62412752, jobscript: $WORKDIR/.snakemake/tmp.89vdaq5h/snakejob.iso_anno_final.15.sh). For error details see the cluster log and the log files of the involved rule(s).

Cluster log:

JOB ID  62412752
==============================
rule    iso_anno_final
==============================
Building DAG of jobs...
Falling back to greedy scheduler because no default solver is found for pulp (you have to install either coincbc or glpk).
Using shell: /bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Select jobs to execute...

[Wed Oct 19 23:38:22 2022]
rule iso_anno_final:
    input: results/homo_sapiens/GRCh38.106/iso_anno_concat.bed
    output: results/homo_sapiens/GRCh38.106/isomirs_annotation.bed
    log: logs/local/homo_sapiens/GRCh38.106/iso_anno_final.log
    jobid: 0
    wildcards: organism=homo_sapiens/GRCh38.106
    resources: mem_mb=1000, disk_mb=1000, tmpdir=$TMPDIR/slurm-job.62412752

(grep -v '5p0_3p0' results/homo_sapiens/GRCh38.106/iso_anno_concat.bed > results/homo_sapiens/GRCh38.106/isomirs_annotation.bed) &> logs/local/homo_sapiens/GRCh38.106/iso_anno_final.log
Activating singularity image $WORKDIR/.snakemake/singularity/de7c7b4627830e4e3ea6242749e3f18c.simg
[Wed Oct 19 23:38:23 2022]
Error in rule iso_anno_final:
    jobid: 0
    output: results/homo_sapiens/GRCh38.106/isomirs_annotation.bed
    log: logs/local/homo_sapiens/GRCh38.106/iso_anno_final.log (check log file(s) for error message)
    shell:
        (grep -v '5p0_3p0' results/homo_sapiens/GRCh38.106/iso_anno_concat.bed > results/homo_sapiens/GRCh38.106/isomirs_annotation.bed) &> logs/local/homo_sapiens/GRCh38.106/iso_anno_final.log
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Removing output files of failed job iso_anno_final since they might be corrupted:
results/homo_sapiens/GRCh38.106/isomirs_annotation.bed
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
==============================

The likely cause is that, when isomiR processing is essentially switched off, an empty file is created as input for rule iso_anno_final. grep exits with status 1 whenever it selects no lines, which is always the case for empty input, and because Snakemake runs shell commands in bash strict mode, that non-zero exit status makes the rule fail.
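
To make the exit-status behaviour concrete, here is a small sketch that runs the same `grep -v '5p0_3p0'` filter on three scratch files. The file names and BED lines are made up for illustration and are not taken from MIRFLOWZ output.

```shell
#!/usr/bin/env bash
# No `set -e` here: the point is to inspect non-zero exit statuses.
tmpdir=$(mktemp -d)

# Hypothetical inputs (names and contents are illustrative only).
: > "$tmpdir/empty.bed"                                  # no lines at all
printf 'chr1\t10\t32\tmiR-x|5p0_3p0\n' > "$tmpdir/canonical.bed"   # every line matches the pattern
printf 'chr1\t10\t32\tmiR-x|5p0_3p0\nchr1\t9\t32\tmiR-x|5p-1_3p0\n' > "$tmpdir/mixed.bed"

grep -v '5p0_3p0' "$tmpdir/empty.bed" > /dev/null
status_empty=$?       # 1: nothing to select from
grep -v '5p0_3p0' "$tmpdir/canonical.bed" > /dev/null
status_canonical=$?   # 1: all lines were filtered out
grep -v '5p0_3p0' "$tmpdir/mixed.bed" > /dev/null
status_mixed=$?       # 0: one line was selected
grep -v '5p0_3p0' "$tmpdir/missing.bed" > /dev/null 2>&1
status_missing=$?     # 2: a real error (file does not exist)

echo "empty=$status_empty canonical=$status_canonical mixed=$status_mixed missing=$status_missing"
```

Status 1 only means "no lines selected", whereas status 2 signals a genuine error, so any workaround should distinguish between the two.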

If so, a solution would be to add logic that prevents grep from failing when no match is found. See for example:

However, empty annotation files could also lead to problems further downstream, so this would need to be carefully tested.
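
One possible guard, as a minimal sketch rather than the fix adopted in the workflow: tolerate grep's exit status 1 (no lines selected) while still failing on status 2 or higher. The helper name `filter_canonical` and the scratch paths are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail   # mimic the strict mode that Snakemake applies

# filter_canonical IN OUT (hypothetical helper): drop '5p0_3p0' entries.
# grep exit status 1 (no lines selected) is accepted; status 2 or higher,
# for example an unreadable input file, still makes the function fail.
filter_canonical() {
    grep -v '5p0_3p0' "$1" > "$2" || [ "$?" -eq 1 ]
}

# Scratch files standing in for the rule's input and output.
tmpdir=$(mktemp -d)
concat="$tmpdir/iso_anno_concat.bed"
final="$tmpdir/isomirs_annotation.bed"
: > "$concat"       # empty input, as assumed when bp_5p and bp_3p are [0]

filter_canonical "$concat" "$final"

# The output exists but may be empty, so downstream steps need to cope with that.
if [ ! -s "$final" ]; then
    echo "warning: $final is empty" >&2
fi
```

Inside the rule's shell command, the same idea would amount to appending `|| [ $? -eq 1 ]` to the grep call within the parentheses. This only removes the grep failure; whether later rules accept an empty annotation file is a separate question that still has to be tested.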

uniqueg commented 1 year ago

This will likely be irrelevant for the new quantification design (see #7)