smith-chem-wisc / Spritz

Software for RNA-Seq analysis to create sample-specific proteoform databases from RNA-Seq data
https://smith-chem-wisc.github.io/Spritz/
MIT License

Error in rule reorder_genome_fasta #236

Closed animesh closed 11 months ago

animesh commented 11 months ago

I am facing this error with the latest version running on Windows 11. Any ideas on how to proceed? The log report is below.

Command executing: Powershell.exe docker pull smithlab/spritz:0.3.10;docker run --rm -i -t --user=root --name spritz1843128396 -v """F:\TK:/app/spritz/results/""" -v """F:\resources:/app/spritz/resources""" smithlab/spritz:0.3.10 conda run --no-capture-output --live-stream dotnet SpritzCMD.dll --threads 6 --analysisDirectory=/app/spritz/results/ --reference="""release-97,homo_sapiens,human,GRCh38""" --analyzeVariants --fastq1=TK10_49 --fastq2=TK10_49 ; docker stop spritz1843128396
Saving output to F:\TK\workflow_2023-08-29-19-11-00.txt. Please monitor it there...

ca5806d5421b: Download complete
0d9226469454: Download complete
d569578b07e4: Download complete
4f4fb700ef54: Download complete
71fd4b2262a5: Download complete
2f649884f099: Download complete
9937597e7ab3: Download complete
30a8bd24115d: Download complete
0ce83ae0a9ff: Download complete
1d5252f66ea9: Download complete
9d03f36366cb: Download complete
97b4c65b128e: Download complete
9a78cf8be88c: Download complete
ede128c8803b: Download complete
39f29c8cb96a: Download complete
docker.io/smithlab/spritz:0.3.10
What's Next?
  View summary of image vulnerabilities and recommendations → docker scout quickview smithlab/spritz:0.3.10
Welcome to Spritz!
Testing analysis directory /app/spritz/results/
Using analysis directory /app/spritz/results/
Running `snakemake -j 6 --use-conda --conda-frontend mamba --configfile /app/spritz/results/config/config.yaml`.
Building DAG of jobs...
Creating conda environment envs/spritzbase.yaml...
Downloading and installing remote packages.
Environment for envs/spritzbase.yaml created (location: .snakemake/conda/ce565c96)
Creating conda environment envs/align.yaml...
Downloading and installing remote packages.
Environment for envs/align.yaml created (location: .snakemake/conda/e361d902)
Creating conda environment envs/proteogenomics.yaml...
Downloading and installing remote packages.
Environment for envs/proteogenomics.yaml created (location: .snakemake/conda/47a7eecc)
Creating conda environment envs/default.yaml...
Downloading and installing remote packages.
Environment for envs/default.yaml created (location: .snakemake/conda/46c8395d)
Creating conda environment envs/downloads.yaml...
Downloading and installing remote packages.
Environment for envs/downloads.yaml created (location: .snakemake/conda/5fe6c4ba)
Creating conda environment envs/variants.yaml...
Downloading and installing remote packages.
Environment for envs/variants.yaml created (location: .snakemake/conda/180357af)
Using shell: /bin/bash
Provided cores: 6
Rules claiming more threads will be scaled down.
Job counts:
    count   jobs
    1   all
    1   base_recalibration
    1   call_gvcf_varaints
    1   call_vcf_variants
    1   dict_fa
    1   download_chromosome_mappings
    1   download_dbsnp_vcf
    1   download_ensembl_references
    1   download_protein_xml
    1   download_snpeff
    1   fastp_fq_uncompressed
    1   final_vcf_naming
    1   finish_variants
    1   generate_reference_snpeff_database
    1   hisat2_align_bam_fq
    1   hisat2_group
    1   hisat2_mark
    1   hisat2_merge_bams
    1   hisat2_splice_sites
    1   hisat_genome
    1   index_ensembl_vcf
    1   index_fa
    1   prose
    1   reference_protein_xml
    1   reorder_genome_fasta
    1   setup_ptmlist_links
    1   setup_transfer_mods
    1   split_n_cigar_reads
    1   transfer_modifications_variant
    1   variant_annotation_ref
    1   variant_tmpdir
    31

[Tue Aug 29 17:15:13 2023]
rule setup_transfer_mods:
    input: ../SpritzModifications.dll
    output: ../resources/ptmlist.txt, ../resources/PSI-MOD.obo.xml
    log: ../resources/setup_transfer_mods.log
    jobid: 4
    benchmark: ../resources/setup_transfer_mods.benchmark


[Tue Aug 29 17:15:13 2023]
rule download_ensembl_references:
    output: ../resources/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.fa, ../resources/ensembl/Homo_sapiens.GRCh38.97.gff3, ../resources/ensembl/Homo_sapiens.GRCh38.pep.all.fa
    log: ../resources/ensembl/downloads.log
    jobid: 7
    benchmark: ../resources/ensembl/downloads.benchmark


[Tue Aug 29 17:15:13 2023]
rule download_protein_xml:
    output: ../resources/uniprot/Homo_sapiens.protein.xml.gz, ../resources/uniprot/Homo_sapiens.protein.fasta
    log: ../resources/uniprot/Homo_sapiens.protein.xml.gz.log
    jobid: 9
    benchmark: ../resources/uniprot/Homo_sapiens.protein.xml.gz.benchmark


[Tue Aug 29 17:15:13 2023]
rule prose:
    output: ../results/prose.txt
    log: ../results/prose.log
    jobid: 1
    wildcards: dir=../results


[Tue Aug 29 17:15:13 2023]
rule variant_tmpdir:
    output: ../resources/tmp
    log: ../resources/tmpdir.log
    jobid: 27


[Tue Aug 29 17:15:13 2023]
rule download_snpeff:
    output: ../resources/SnpEff/snpEff.config, ../resources/SnpEff/snpEff.jar, ../resources/SnpEff_4.3_SmithChemWisc_v2.zip
    log: ../resources/SnpEffInstall.log
    jobid: 6

Activating conda environment: /app/spritz/workflow/.snakemake/conda/5fe6c4ba
Activating conda environment: /app/spritz/workflow/.snakemake/conda/ce565c96
Activating conda environment: /app/spritz/workflow/.snakemake/conda/180357af
Activating conda environment: /app/spritz/workflow/.snakemake/conda/5fe6c4ba
Activating conda environment: /app/spritz/workflow/.snakemake/conda/5fe6c4ba
Activating conda environment: /app/spritz/workflow/.snakemake/conda/46c8395d
[Tue Aug 29 17:15:14 2023]
Finished job 27.
1 of 31 steps (3%) done
[Tue Aug 29 17:15:14 2023]
Finished job 1.
2 of 31 steps (6%) done
[Tue Aug 29 17:15:17 2023]
Finished job 9.
3 of 31 steps (10%) done
[Tue Aug 29 17:15:19 2023]
Finished job 4.
4 of 31 steps (13%) done

[Tue Aug 29 17:15:19 2023]
rule download_chromosome_mappings:
    output: ../resources/ChromosomeMappings/GRCh38_UCSC2ensembl.txt
    log: ../resources/download_chromosome_mappings.log
    jobid: 16
    benchmark: ../resources/download_chromosome_mappings.benchmark


[Tue Aug 29 17:15:19 2023]
rule setup_ptmlist_links:
    input: ../resources/ptmlist.txt, ../resources/PSI-MOD.obo.xml
    output: ptmlist.txt, PSI-MOD.obo.xml
    log: ../resources/setup_transfer_mod_linking.log
    jobid: 3
    benchmark: ../resources/setup_transfer_mod_linking.benchmark

Activating conda environment: /app/spritz/workflow/.snakemake/conda/5fe6c4ba
Activating conda environment: /app/spritz/workflow/.snakemake/conda/47a7eecc
[Tue Aug 29 17:15:21 2023]
Finished job 3.
5 of 31 steps (16%) done
[Tue Aug 29 17:15:25 2023]
Finished job 16.
6 of 31 steps (19%) done

[Tue Aug 29 17:15:25 2023]
rule download_dbsnp_vcf:
    input: ../resources/ChromosomeMappings/GRCh38_UCSC2ensembl.txt
    output: ../resources/ensembl/Homo_sapiens.ensembl.vcf
    log: ../resources/ensembl/downloads_dbsnp_vcf.log
    jobid: 15
    benchmark: ../resources/ensembl/downloads_dbsnp_vcf.benchmark

Activating conda environment: /app/spritz/workflow/.snakemake/conda/5fe6c4ba
Removing temporary output file ../resources/SnpEff_4.3_SmithChemWisc_v2.zip.
[Tue Aug 29 17:20:41 2023]
Finished job 6.
7 of 31 steps (23%) done
[Tue Aug 29 17:27:01 2023]
Finished job 7.
8 of 31 steps (26%) done

[Tue Aug 29 17:27:01 2023]
rule hisat2_splice_sites:
    input: ../resources/ensembl/Homo_sapiens.GRCh38.97.gff3
    output: ../resources/ensembl/Homo_sapiens.GRCh38.97.splicesites.txt
    log: ../resources/ensembl/Homo_sapiens.GRCh38.97.splicesites.log
    jobid: 26


[Tue Aug 29 17:27:01 2023]
rule reorder_genome_fasta:
    input: ../resources/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.fa
    output: ../resources/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.karyotypic.fa
    log: ../resources/ensembl/karyotypic_order.log
    jobid: 8
    benchmark: ../resources/ensembl/karyotypic_order.benchmark

Activating conda environment: /app/spritz/workflow/.snakemake/conda/5fe6c4ba
Activating conda environment: /app/spritz/workflow/.snakemake/conda/e361d902
[Tue Aug 29 17:27:12 2023]
Finished job 26.
9 of 31 steps (29%) done
[Tue Aug 29 17:28:30 2023]
Error in rule reorder_genome_fasta:
    jobid: 8
    output: ../resources/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.karyotypic.fa
    log: ../resources/ensembl/karyotypic_order.log (check log file(s) for error message)
    conda-env: /app/spritz/workflow/.snakemake/conda/5fe6c4ba
    shell:
        python scripts/karyotypic_order.py 2> ../resources/ensembl/karyotypic_order.log
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Removing output files of failed job reorder_genome_fasta since they might be corrupted:
../resources/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.karyotypic.fa

acesnik commented 11 months ago

It looks like reorder_genome_fasta failed. Could you please send along the files ../resources/ensembl/karyotypic_order.benchmark and ../resources/ensembl/karyotypic_order.log?
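
For anyone collecting these files, a minimal Python sketch along the following lines could print the end of each one. It assumes the files sit under `F:\resources\ensembl` on the Windows host, which is inferred from the `-v` mount in the command at the top of this issue rather than confirmed; `tail_lines` is a hypothetical helper, not part of Spritz.

```python
from collections import deque
from pathlib import Path


def tail_lines(path, n=20):
    """Return the last n lines of a text file, or an empty list if it is missing."""
    p = Path(path)
    if not p.is_file():
        return []
    with p.open(encoding="utf-8", errors="replace") as handle:
        # A bounded deque keeps only the final n lines while streaming the file.
        return list(deque((line.rstrip("\n") for line in handle), maxlen=n))


if __name__ == "__main__":
    # Assumed host-side location, inferred from the Docker volume mount.
    log_dir = Path(r"F:\resources\ensembl")
    for name in ("karyotypic_order.log", "karyotypic_order.benchmark"):
        print(f"--- {name} ---")
        for line in tail_lines(log_dir / name):
            print(line)
```

If the files live elsewhere, only `log_dir` needs to change.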

animesh commented 11 months ago

Sure, here are karyotypic_order - Copy.benchmark.txt and karyotypic_order - Copy.log.txt, but it looks like the issue is probably hard disk space, since I had given it only about 40G. I have restarted with 120G and so far it seems to be going well, at least no complaints 🤞

 UID                 PID                 PPID                C                   STIME               TTY                 TIME                CMD
root                25267               25241               0                   09:01               ?                   00:00:00            /opt/conda/bin/python /opt/conda/bin/conda run --no-capture-output --live-stream dotnet SpritzCMD.dll --threads 6 --analysisDirectory=/app/spritz/results/ --reference=release-97,homo_sapiens,human,GRCh38 --analyzeVariants --fastq1=TK10_49 --fastq2=TK10_49
root                25304               25267               0                   09:01               ?                   00:00:00            /bin/bash /tmp/tmpdjts3fr5
root                25318               25304               0                   09:01               ?                   00:00:00            dotnet SpritzCMD.dll --threads 6 --analysisDirectory=/app/spritz/results/ --reference=release-97,homo_sapiens,human,GRCh38 --analyzeVariants --fastq1=TK10_49 --fastq2=TK10_49
root                25326               25318               0                   09:01               ?                   00:00:44            /opt/conda/bin/python3.8 /opt/conda/bin/snakemake -j 6 --use-conda --conda-frontend mamba --configfile /app/spritz/results/config/config.yaml
root                53909               25326               0                   12:33               ?                   00:00:00            /bin/bash -c source /opt/conda/bin/activate '/app/spritz/workflow/.snakemake/conda/180357af'; set -euo pipefail; (gatk --java-options "-Xmx24000M -Dsamjdk.compression_level=9" FixMisencodedBaseQualityReads -I ../results/variants/combined.sorted.grouped.marked.bam -O ../results/variants/combined.fixedQuals.bam && gatk --java-options "-Xmx24000M -Dsamjdk.compression_level=9" SplitNCigarReads -R ../resources/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.karyotypic.fa -I ../results/variants/combined.fixedQuals.bam -O ../results/variants/combined.sorted.grouped.marked.split.bam --tmp-dir ../resources/tmp || gatk --java-options "-Xmx24000M -Dsamjdk.compression_level=9" SplitNCigarReads -R ../resources/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.karyotypic.fa -I ../results/variants/combined.sorted.grouped.marked.bam -O ../results/variants/combined.sorted.grouped.marked.split.bam --tmp-dir ../resources/tmp; samtools index ../results/variants/combined.sorted.grouped.marked.split.bam) &> ../results/variants/combined.sorted.grouped.marked.split.log
root                53918               53909               0                   12:33               ?                   00:00:00            /bin/bash -c source /opt/conda/bin/activate '/app/spritz/workflow/.snakemake/conda/180357af'; set -euo pipefail; (gatk --java-options "-Xmx24000M -Dsamjdk.compression_level=9" FixMisencodedBaseQualityReads -I ../results/variants/combined.sorted.grouped.marked.bam -O ../results/variants/combined.fixedQuals.bam && gatk --java-options "-Xmx24000M -Dsamjdk.compression_level=9" SplitNCigarReads -R ../resources/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.karyotypic.fa -I ../results/variants/combined.fixedQuals.bam -O ../results/variants/combined.sorted.grouped.marked.split.bam --tmp-dir ../resources/tmp || gatk --java-options "-Xmx24000M -Dsamjdk.compression_level=9" SplitNCigarReads -R ../resources/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.karyotypic.fa -I ../results/variants/combined.sorted.grouped.marked.bam -O ../results/variants/combined.sorted.grouped.marked.split.bam --tmp-dir ../resources/tmp; samtools index ../results/variants/combined.sorted.grouped.marked.split.bam) &> ../results/variants/combined.sorted.grouped.marked.split.log
root                53946               53918               0                   12:33               ?                   00:00:00            python /app/spritz/workflow/.snakemake/conda/180357af/bin/gatk --java-options -Xmx24000M -Dsamjdk.compression_level=9 SplitNCigarReads -R ../resources/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.karyotypic.fa -I ../results/variants/combined.sorted.grouped.marked.bam -O ../results/variants/combined.sorted.grouped.marked.split.bam --tmp-dir ../resources/tmp
root                53947               53946               99                  12:33               ?                   00:10:01            java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx24000M -Dsamjdk.compression_level=9 -jar /app/spritz/workflow/.snakemake/conda/180357af/share/gatk4-4.2.0.0-1/gatk-package-4.2.0.0-local.jar SplitNCigarReads -R ../resources/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.karyotypic.fa -I ../results/variants/combined.sorted.grouped.marked.bam -O ../results/variants/combined.sorted.grouped.marked.split.bam --tmp-dir ../resources/tmp
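
Since the failure appears to have come from running out of disk space, a quick check before launching the workflow might save a long wait. The sketch below uses Python's standard `shutil.disk_usage`. The 120 figure is only the allocation that worked in this thread, not a documented Spritz requirement, and `free_gib` and `has_enough_space` are hypothetical helper names. It should be run wherever the workflow actually writes (for example inside the container), because that is the filesystem whose free space matters.

```python
import shutil

GIB = 1024 ** 3


def free_gib(path):
    """Return free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / GIB


def has_enough_space(path, required_gib):
    """True if the filesystem holding `path` has at least `required_gib` GiB free."""
    return free_gib(path) >= required_gib


if __name__ == "__main__":
    # 120 is the allocation that worked in this thread, not an official minimum.
    target = "."
    print(f"{free_gib(target):.1f} GiB free; enough: {has_enough_space(target, 120)}")
```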

acesnik commented 11 months ago

Thanks for the update. Feel free to reopen if this issue persists.