smith-chem-wisc / Spritz

Software for RNA-Seq analysis to create sample-specific proteoform databases
https://smith-chem-wisc.github.io/Spritz/
MIT License
7 stars, 11 forks

(1) "Error waiting for container: invalid character 'u' looking for beginning of value" (2) "Could not execute because the application was not found or a compatible .NET SDK is not installed." #212

Closed by animesh 3 years ago

animesh commented 3 years ago

Docker is running though... below is the full log

Command executing: Powershell.exe docker pull smithlab/spritz ;docker run --rm -i -t --name spritz-956943642 -v """Z:\AGS\AGS RNAseq CNIO\raw data:/app/analysis""" -v """Z:\AGS\AGS RNAseq CNIO\raw data\data:/app/data""" -v """Z:\AGS\AGS RNAseq CNIO\raw data\configs:/app/configs""" smithlab/spritz; docker stop spritz-956943642
Saving output to Z:\AGS\AGS RNAseq CNIO\raw data\workflow_2021-03-23-10-45-25.txt. Please monitor it there...

Using default tag: latest
latest: Pulling from smithlab/spritz
Digest: sha256:55172c3a6e32257f977c9512e473f647eaeab32e35b6d96341598d6b96f97615
Status: Image is up to date for smithlab/spritz:latest
docker.io/smithlab/spritz:latest
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 12
Rules claiming more threads will be scaled down.
Conda environments: ignored
Job counts:
    count   jobs
    1   all
    1   base_recalibration
    1   build_transfer_mods
    1   call_gvcf_varaints
    1   call_vcf_variants
    1   download_snpeff
    1   final_vcf_naming
    1   finish_variants
    1   generate_reference_snpeff_database
    1   hisat2_groupmark_bam
    1   reference_protein_xml
    1   split_n_cigar_reads
    1   tmpdir
    1   transfer_modifications_variant
    1   variant_annotation_ref
    15

[Tue Mar 23 09:45:38 2021]
rule download_snpeff:
    output: SnpEff/snpEff.config, SnpEff/snpEff.jar, SnpEff_4.3_SmithChemWisc_v2.zip
    log: data/SnpEffInstall.log
    jobid: 5


[Tue Mar 23 09:45:38 2021]
rule build_transfer_mods:
    output: TransferUniProtModifications/TransferUniProtModifications/bin/Release/netcoreapp3.1/TransferUniProtModifications.dll
    log: data/TransferUniProtModifications.build.log
    jobid: 72
    benchmark: data/TransferUniProtModifications.build.benchmark


[Tue Mar 23 09:45:38 2021]
rule tmpdir:
    output: tmp, temporary
    log: data/tmpdir.log
    jobid: 68

Removing temporary output file temporary.
[Tue Mar 23 09:45:38 2021]
Finished job 68.
1 of 15 steps (7%) done

[Tue Mar 23 09:45:38 2021]
rule hisat2_groupmark_bam:
    input: analysis/align/combined.sorted.bam, tmp
    output: analysis/variants/combined.sorted.grouped.bam, analysis/variants/combined.sorted.grouped.bam.bai, analysis/variants/combined.sorted.grouped.marked.bam, analysis/variants/combined.sorted.grouped.marked.bam.bai, analysis/variants/combined.sorted.grouped.marked.metrics
    log: analysis/variants/combined.sorted.grouped.marked.log
    jobid: 16
    benchmark: analysis/variants/combined.sorted.grouped.marked.benchmark
    wildcards: dir=analysis
    resources: mem_mb=16000

[Tue Mar 23 09:45:50 2021]
Finished job 72.
2 of 15 steps (13%) done
Removing temporary output file SnpEff_4.3_SmithChemWisc_v2.zip.
[Tue Mar 23 09:46:31 2021]
Finished job 5.
3 of 15 steps (20%) done

[Tue Mar 23 09:46:31 2021]
rule generate_reference_snpeff_database:
    input: SnpEff/snpEff.jar, data/ensembl/Homo_sapiens.GRCh38.97.gff3, data/ensembl/Homo_sapiens.GRCh38.pep.all.fa, data/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.karyotypic.fa
    output: SnpEff/data/Homo_sapiens.GRCh38/protein.fa, SnpEff/data/Homo_sapiens.GRCh38/genes.gff, SnpEff/data/genomes/Homo_sapiens.GRCh38.fa, SnpEff/data/Homo_sapiens.GRCh38/doneHomo_sapiens.GRCh38.txt
    log: SnpEff/data/Homo_sapiens.GRCh38/snpeffdatabase.log
    jobid: 4
    benchmark: SnpEff/data/Homo_sapiens.GRCh38/snpeffdatabase.benchmark
    resources: mem_mb=16000

[Tue Mar 23 09:50:32 2021]
Finished job 4.
4 of 15 steps (27%) done

[Tue Mar 23 09:50:32 2021]
rule reference_protein_xml:
    input: SnpEff/data/Homo_sapiens.GRCh38/doneHomo_sapiens.GRCh38.txt, SnpEff/snpEff.jar, data/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.karyotypic.fa, TransferUniProtModifications/TransferUniProtModifications/bin/Release/netcoreapp3.1/TransferUniProtModifications.dll, data/uniprot/Homo_sapiens.protein.xml.gz
    output: analysis/variants/doneHomo_sapiens.GRCh38.97.txt, analysis/variants/Homo_sapiens.GRCh38.97.protein.xml, analysis/variants/Homo_sapiens.GRCh38.97.protein.xml.gz, analysis/variants/Homo_sapiens.GRCh38.97.protein.fasta, analysis/variants/Homo_sapiens.GRCh38.97.protein.withdecoys.fasta, analysis/variants/Homo_sapiens.GRCh38.97.protein.withmods.xml, analysis/variants/Homo_sapiens.GRCh38.97.protein.withmods.xml.gz
    log: analysis/variants/Homo_sapiens.GRCh38.97.spritz.log
    jobid: 74
    benchmark: analysis/variants/Homo_sapiens.GRCh38.97.spritz.benchmark
    wildcards: dir=analysis
    resources: mem_mb=16000

Removing temporary output file analysis/variants/Homo_sapiens.GRCh38.97.protein.xml.
Removing temporary output file analysis/variants/Homo_sapiens.GRCh38.97.protein.withmods.xml.
[Tue Mar 23 10:08:21 2021]
Finished job 74.
5 of 15 steps (33%) done
time="2021-03-23T15:01:58+01:00" level=error msg="error waiting for container: invalid character 'u' looking for beginning of value"
Done!
acesnik commented 3 years ago

Hmm, I've never seen that one before. Waiting for 'u' is a particularly odd error message, too!

It seems like this is a Docker issue that's still open: https://github.com/docker/for-mac/issues/5139, dealing with the VM freezing. Have you gotten this error multiple times?

animesh commented 3 years ago

Yes, it comes and goes... when I restart, it seems to get stuck in the following steps and has never made it to the end with this data. I can probably try to ditch Docker and run natively? Is there some "dry run" output of Spritz so I can go one step at a time?

Using default tag: latest
latest: Pulling from smithlab/spritz
Digest: sha256:55172c3a6e32257f977c9512e473f647eaeab32e35b6d96341598d6b96f97615
Status: Image is up to date for smithlab/spritz:latest
docker.io/smithlab/spritz:latest
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 12
Rules claiming more threads will be scaled down.
Conda environments: ignored
Job counts:
    count   jobs
    1   all
    1   base_recalibration
    1   build_transfer_mods
    1   call_gvcf_varaints
    1   call_vcf_variants
    1   copy_gff3_to_snpeff
    1   custom_protein_xml
    1   download_snpeff
    1   final_vcf_naming
    1   finish_isoform
    1   finish_isoform_variants
    1   finish_variants
    1   generate_reference_snpeff_database
    1   generate_snpeff_database
    1   hisat2_groupmark_bam
    1   reference_protein_xml
    1   split_n_cigar_reads
    1   tmpdir
    1   transfer_modifications_isoformvariant
    1   transfer_modifications_variant
    1   variant_annotation_custom
    1   variant_annotation_ref
    22

[Fri Mar 26 10:06:55 2021]
rule tmpdir:
    output: tmp, temporary
    log: data/tmpdir.log
    jobid: 68


[Fri Mar 26 10:06:55 2021]
rule build_transfer_mods:
    output: TransferUniProtModifications/TransferUniProtModifications/bin/Release/netcoreapp3.1/TransferUniProtModifications.dll
    log: data/TransferUniProtModifications.build.log
    jobid: 72
    benchmark: data/TransferUniProtModifications.build.benchmark


[Fri Mar 26 10:06:55 2021]
rule download_snpeff:
    output: SnpEff/snpEff.config, SnpEff/snpEff.jar, SnpEff_4.3_SmithChemWisc_v2.zip
    log: data/SnpEffInstall.log
    jobid: 5

Removing temporary output file temporary.
[Fri Mar 26 10:06:55 2021]
Finished job 68.
1 of 22 steps (5%) done

[Fri Mar 26 10:06:55 2021]
rule copy_gff3_to_snpeff:
    input: analysis/isoforms/combined.transcripts.genome.cds.gff3
    output: SnpEff/data/combined.transcripts.genome.gff3/genes.gff
    log: SnpEff/data/combined.transcripts.genome.gff3/copy_gff3_to_snpeff.log
    jobid: 77


[Fri Mar 26 10:06:55 2021]
rule hisat2_groupmark_bam:
    input: analysis/align/combined.sorted.bam, tmp
    output: analysis/variants/combined.sorted.grouped.bam, analysis/variants/combined.sorted.grouped.bam.bai, analysis/variants/combined.sorted.grouped.marked.bam, analysis/variants/combined.sorted.grouped.marked.bam.bai, analysis/variants/combined.sorted.grouped.marked.metrics
    log: analysis/variants/combined.sorted.grouped.marked.log
    jobid: 16
    benchmark: analysis/variants/combined.sorted.grouped.marked.benchmark
    wildcards: dir=analysis
    resources: mem_mb=16000

[Fri Mar 26 10:06:57 2021]
Finished job 77.
2 of 22 steps (9%) done
[Fri Mar 26 10:07:06 2021]
Finished job 72.
3 of 22 steps (14%) done
Removing temporary output file SnpEff_4.3_SmithChemWisc_v2.zip.
[Fri Mar 26 10:07:53 2021]
Finished job 5.
4 of 22 steps (18%) done

[Fri Mar 26 10:07:53 2021]
rule generate_reference_snpeff_database:
    input: SnpEff/snpEff.jar, data/ensembl/Homo_sapiens.GRCh38.97.gff3, data/ensembl/Homo_sapiens.GRCh38.pep.all.fa, data/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.karyotypic.fa
    output: SnpEff/data/Homo_sapiens.GRCh38/protein.fa, SnpEff/data/Homo_sapiens.GRCh38/genes.gff, SnpEff/data/genomes/Homo_sapiens.GRCh38.fa, SnpEff/data/Homo_sapiens.GRCh38/doneHomo_sapiens.GRCh38.txt
    log: SnpEff/data/Homo_sapiens.GRCh38/snpeffdatabase.log
    jobid: 4
    benchmark: SnpEff/data/Homo_sapiens.GRCh38/snpeffdatabase.benchmark
    resources: mem_mb=16000


[Fri Mar 26 10:07:53 2021]
rule generate_snpeff_database:
    input: SnpEff/snpEff.jar, SnpEff/data/combined.transcripts.genome.gff3/genes.gff, data/ensembl/Homo_sapiens.GRCh38.pep.all.fa, data/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.karyotypic.fa
    output: SnpEff/data/combined.transcripts.genome.gff3/protein.fa, SnpEff/data/genomes/combined.transcripts.genome.gff3.fa, SnpEff/data/combined.transcripts.genome.gff3/done.txt
    log: SnpEff/data/combined.transcripts.genome.gff3/snpeffdatabase.log
    jobid: 111
    benchmark: SnpEff/data/combined.transcripts.genome.gff3/snpeffdatabase.benchmark
    resources: mem_mb=16000

[Fri Mar 26 10:10:40 2021]
Finished job 111.
5 of 22 steps (23%) done

[Fri Mar 26 10:10:40 2021]
rule custom_protein_xml:
    input: SnpEff/snpEff.jar, data/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.karyotypic.fa, SnpEff/data/combined.transcripts.genome.gff3/genes.gff, SnpEff/data/combined.transcripts.genome.gff3/protein.fa, SnpEff/data/genomes/combined.transcripts.genome.gff3.fa, SnpEff/data/combined.transcripts.genome.gff3/done.txt, TransferUniProtModifications/TransferUniProtModifications/bin/Release/netcoreapp3.1/TransferUniProtModifications.dll, data/uniprot/Homo_sapiens.protein.xml.gz
    output: analysis/isoforms/combined.spritz.isoform.protein.xml, analysis/isoforms/combined.spritz.isoform.protein.withdecoys.fasta, analysis/isoforms/combined.spritz.isoform.protein.xml.gz, analysis/isoforms/combined.spritz.isoform.protein.withmods.xml, analysis/isoforms/combined.spritz.isoform.protein.withmods.xml.gz, analysis/isoforms/combined.spritz.isoform.protein.fasta
    log: analysis/isoforms/combined.spritz.isoform.log
    jobid: 114
    benchmark: analysis/isoforms/combined.spritz.isoform.benchmark
    wildcards: dir=analysis
    resources: mem_mb=16000

[Fri Mar 26 10:12:46 2021]
Finished job 4.
6 of 22 steps (27%) done

[Fri Mar 26 10:12:46 2021]
rule reference_protein_xml:
    input: SnpEff/data/Homo_sapiens.GRCh38/doneHomo_sapiens.GRCh38.txt, SnpEff/snpEff.jar, data/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.karyotypic.fa, TransferUniProtModifications/TransferUniProtModifications/bin/Release/netcoreapp3.1/TransferUniProtModifications.dll, data/uniprot/Homo_sapiens.protein.xml.gz
    output: analysis/variants/doneHomo_sapiens.GRCh38.97.txt, analysis/variants/Homo_sapiens.GRCh38.97.protein.xml, analysis/variants/Homo_sapiens.GRCh38.97.protein.xml.gz, analysis/variants/Homo_sapiens.GRCh38.97.protein.fasta, analysis/variants/Homo_sapiens.GRCh38.97.protein.withdecoys.fasta, analysis/variants/Homo_sapiens.GRCh38.97.protein.withmods.xml, analysis/variants/Homo_sapiens.GRCh38.97.protein.withmods.xml.gz
    log: analysis/variants/Homo_sapiens.GRCh38.97.spritz.log
    jobid: 74
    benchmark: analysis/variants/Homo_sapiens.GRCh38.97.spritz.benchmark
    wildcards: dir=analysis
    resources: mem_mb=16000

Removing temporary output file analysis/isoforms/combined.spritz.isoform.protein.xml.
Removing temporary output file analysis/isoforms/combined.spritz.isoform.protein.withmods.xml.
[Fri Mar 26 10:15:16 2021]
Finished job 114.
7 of 22 steps (32%) done

[Fri Mar 26 10:15:16 2021]
rule finish_isoform:
    input: analysis/isoforms/combined.spritz.isoform.protein.fasta, analysis/isoforms/combined.spritz.isoform.protein.withdecoys.fasta, analysis/isoforms/combined.spritz.isoform.protein.withmods.xml.gz
    output: analysis/final/combined.spritz.isoform.protein.fasta, analysis/final/combined.spritz.isoform.protein.withdecoys.fasta, analysis/final/combined.spritz.isoform.protein.withmods.xml.gz
    log: analysis/isoforms/finish_isoform.log
    jobid: 113
    wildcards: dir=analysis

[Fri Mar 26 10:15:17 2021]
Finished job 113.
8 of 22 steps (36%) done
Removing temporary output file analysis/variants/Homo_sapiens.GRCh38.97.protein.xml.
Removing temporary output file analysis/variants/Homo_sapiens.GRCh38.97.protein.withmods.xml.
[Fri Mar 26 10:19:50 2021]
Finished job 74.
9 of 22 steps (41%) done
acesnik commented 3 years ago

Sorry about Docker being flaky. That's frustrating.

For the time being, I'd recommend using the command-line version, https://github.com/smith-chem-wisc/Spritz/wiki/Spritz-commandline-usage. You should be able to specify the directory you're using for the analysis with the "analysisDirectory" setting in config.yaml.

animesh commented 3 years ago

Thanks @acesnik, just to confirm before I proceed: I see there is a config.yaml, which I am guessing is created by the GUI?

$find . -iname config.yaml
./configs/config.yaml

$cat ./configs/config.yaml
version: 1
sra: []
sra_se: []
fq: [22286_CGATGT_C5E7AANXX_5_20141008B_20141008.bam., 22287_TGACCA_C5E7AANXX_5_20141008B_20141008.bam., 22288_ACAGTG_C5E7AANXX_5_20141008B_20141008.bam., 22289_GCCAAT_C5E7AANXX_5_20141008B_20141008.bam., 22290_CAGATC_C5E7AANXX_5_20141008B_20141008.bam., 22291_CTTGTA_C5E7AANXX_5_20141008B_20141008.bam., 22292_AGTCAA_C5E7AANXX_5_20141008B_20141008.bam., 22293_AGTTCC_C5E7AANXX_5_20141008B_20141008.bam., 22294_ATGTCA_C5E7AANXX_6_20141008B_20141008.bam., 22295_CCGTCC_C5E7AANXX_6_20141008B_20141008.bam., 22296_GTCCGC_C5E7AANXX_6_20141008B_20141008.bam., 22297_GTGAAA_C5E7AANXX_6_20141008B_20141008.bam., 22298_ATCACG_C5E7AANXX_6_20141008B_20141008.bam., 22299_TTAGGC_C5E7AANXX_6_20141008B_20141008.bam., 22300_ACTTGA_C5E7AANXX_6_20141008B_20141008.bam., 22301_GATCAG_C5E7AANXX_6_20141008B_20141008.bam., 22302_TAGCTT_C5E7AANXX_7_20141008B_20141008.bam., 22303_GGCTAC_C5E7AANXX_7_20141008B_20141008.bam., 22304_GTGGCC_C5E7AANXX_7_20141008B_20141008.bam., 22305_GTTTCG_C5E7AANXX_7_20141008B_20141008.bam., 22306_CGTACG_C5E7AANXX_7_20141008B_20141008.bam., 22307_GAGTGG_C5E7AANXX_7_20141008B_20141008.bam., 22308_ACTGAT_C5E7AANXX_7_20141008B_20141008.bam., 22309_ATTCCT_C5E7AANXX_7_20141008B_20141008.bam.]
fq_se: []
analysisDirectory: [analysis]
release: "97"
species: "Homo_sapiens"
organism: "human"
genome: "GRCh38"
analyses: [variant, isoform]
spritzversion: "0.2.4"
...

I am wondering if I need to change analysisDirectory: [analysis] to analysisDirectory: [$PWD], and can I go beyond 12 threads, which seemed to be the limit in the GUI, for the following invocation from $PWD? Specifically, will snakemake -j 24 --resources mem_mb=64000 restart the process from where the GUI left off?

acesnik commented 3 years ago

The analysisDirectory in this case should be the absolute path to the directory that has the FASTQs in it. This should be a different directory than the one with the Snakefile, which is where you will run the snakemake command.

The snakemake command you listed looks good.
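For example (a sketch, with /home/animeshs/rnAGS as a stand-in for wherever the FASTQ/BAM files actually live):

```yaml
# configs/config.yaml -- analysisDirectory should be the absolute path to the
# folder holding the input files, not the folder containing the Snakefile:
analysisDirectory: [/home/animeshs/rnAGS]
```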

animesh commented 3 years ago

Thanks @acesnik for looking into this 👍🏼 but I am not sure where to run the snakemake command from. I tried to find the Snakefile

(spritz) animeshs@DMED7596:~/rnAGS$ find . -iname "*snake*"

or the workflow

(spritz) animeshs@DMED7596:~/rnAGS$ find . -iname "*work*"
./workflow_2021-01-28-12-29-46.txt
./workflow_2021-01-29-10-20-21.txt
./workflow_2021-01-31-13-16-21.txt
./workflow_2021-01-31-17-27-10.txt
./workflow_2021-02-02-11-09-48.txt
./workflow_2021-02-04-12-46-07.txt
./workflow_2021-02-04-13-20-47.txt
./workflow_2021-02-06-14-07-34.txt
./workflow_2021-02-06-18-14-16.txt
./workflow_2021-02-08-09-00-33.txt
./workflow_2021-02-10-10-56-29.txt
./workflow_2021-02-15-15-49-22.txt
./workflow_2021-02-23-12-32-18.txt
./workflow_2021-02-25-11-34-18.txt
./workflow_2021-02-25-17-52-46.txt
./workflow_2021-03-02-11-21-18.txt
./workflow_2021-03-07-14-33-17.txt
./workflow_2021-03-08-17-28-24.txt
./workflow_2021-03-19-11-57-04.txt
./workflow_2021-03-23-10-45-25.txt
./workflow_2021-03-26-11-06-45.txt

without success. Any ideas where it might be, or which directory to run the command from?

acesnik commented 3 years ago

Try searching for the Snakefile and run it from that directory. It looks like you got the spritz environment set up, so that's good! The environment.yaml file is in the same folder as the Snakefile, which is the working directory.

acesnik commented 3 years ago

In other words, assuming your git clone is named Spritz, you should be able to run it in Spritz/Spritz where the Snakefile is located. https://github.com/smith-chem-wisc/Spritz/tree/master/Spritz

animesh commented 3 years ago

Looks like (spritz) animeshs@DMED7596:~/Spritz/Spritz$ snakemake -j 24 --resources mem_mb=64000 1>log.1.txt 2>log.2.txt 0>log.0.txt >> log.txt & seems to have worked, but it didn't start from where the GUI version left off (attached log: log.2.txt)? Also there is an error-like message in the log:

(base) animeshs@DMED7596:~$ tail -f Spritz/Spritz/log.2.txt
Write-protected output files for rule reference_protein_xml:
/home/animeshs/rnAGS/variants/doneHomo_sapiens.GRCh38.97.txt
/home/animeshs/rnAGS/variants/Homo_sapiens.GRCh38.97.protein.xml.gz
/home/animeshs/rnAGS/variants/Homo_sapiens.GRCh38.97.protein.fasta
/home/animeshs/rnAGS/variants/Homo_sapiens.GRCh38.97.protein.withdecoys.fasta
/home/animeshs/rnAGS/variants/Homo_sapiens.GRCh38.97.protein.withmods.xml.gz
  File "/home/animeshs/miniconda3/envs/spritz/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 131, in run_jobs
  File "/home/animeshs/miniconda3/envs/spritz/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 433, in run
  File "/home/animeshs/miniconda3/envs/spritz/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 225, in _run
  File "/home/animeshs/miniconda3/envs/spritz/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 150, in _run

but htop shows things are running? Wondering if I set things up correctly in the first place, though?

acesnik commented 3 years ago

Oh, I see. I think it started earlier than where you were before because the resources that were downloaded into the Docker container (genome, gene model, etc.) were downloaded again, so the timestamps of those resources are newer than those from the point you previously reached...

Regarding the write-protected files, I wonder if there are any hanging Docker containers. You can see if any are still running with docker container ls.

I think the best thing to do now would be to just let it run, or to start it again with the --keep-going flag, i.e. ~/Spritz/Spritz$ snakemake -j 24 --keep-going --resources mem_mb=64000 1>log.1.txt 2>log.2.txt 0>log.0.txt >> log.txt, since you got an error with the reference_protein_xml database rule. That flag will let it go as far as it can towards the other final databases and ignore that reference_protein_xml write error if it keeps popping up.

animesh commented 3 years ago

I rebooted the machine, so Docker is out, at least that's what I get when

C:\Users\animeshs\GD\scripts>docker container ls
error during connect: This error may indicate that the docker daemon is not running.: Get http://%2F%2F.%2Fpipe%2Fdocker_engine/v1.24/containers/json: open //./pipe/docker_engine: The system cannot find the file specified.

BTW the last command crashed with the message in log.2.txt; now trying the --keep-going flag, keeping fingers crossed...

acesnik commented 3 years ago

Okay! Hoping for the best! Thanks for the patience with this one!

acesnik commented 3 years ago

Did the rest of the run go okay for you?

animesh commented 3 years ago

It crashed with that "Write-protected" error (log.2.txt). I tried chown -R, which didn't work; I probably need to restart the machine but am waiting for some other work to finish first... is there any way to get past this without restarting?

acesnik commented 3 years ago

Is it possible to remove these files manually?

rm -f /home/animeshs/rnAGS/variants/doneHomo_sapiens.GRCh38.97.txt
rm -f /home/animeshs/rnAGS/variants/Homo_sapiens.GRCh38.97.protein.xml.gz
rm -f /home/animeshs/rnAGS/variants/Homo_sapiens.GRCh38.97.protein.fasta
rm -f /home/animeshs/rnAGS/variants/Homo_sapiens.GRCh38.97.protein.withdecoys.fasta
rm -f /home/animeshs/rnAGS/variants/Homo_sapiens.GRCh38.97.protein.withmods.xml.gz
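Alternatively, if deleting feels risky: these outputs are marked protected() by Snakemake, which removes the write bit, so restoring user write permission should also clear a ProtectedOutputException. A sketch (unprotect is a hypothetical helper, not part of Spritz):

```shell
# Sketch: restore the user write bit on Snakemake protected() outputs
# instead of deleting them. "unprotect" is a hypothetical helper name.
unprotect() {
  for f in "$@"; do
    # only touch files that actually exist
    if [ -e "$f" ]; then chmod u+w "$f"; fi
  done
}

# e.g.:
# unprotect /home/animeshs/rnAGS/variants/doneHomo_sapiens.GRCh38.97.txt
```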
animesh commented 3 years ago

Now it's complaining about "Write-protected output files for rule hisat2_align_bam_fq: /home/animeshs/rnAGS/align/22308_ACTGAT_C5E7AANXX_7_20141008B_20141008.bam..fq.sorted.bam"; below is the invocation and log:

(spritz) animeshs@DMED7596:~/Spritz/Spritz$ snakemake -j 24 --keep-going --resources mem_mb=64000 1>log.1.txt 2>log.2.txt 0>log.0.txt >> log.txt &
[1] 23965
(spritz) animeshs@DMED7596:~/Spritz/Spritz$ tail -f log.*
==> log.0.txt <==

==> log.1.txt <==

==> log.2.txt <==
Building DAG of jobs...

==> log.txt <==

==> log.2.txt <==
Using shell: /bin/bash
Provided cores: 24
Rules claiming more threads will be scaled down.
Provided resources: mem_mb=64000
Conda environments: ignored
Job counts:
        count   jobs
        1       LongOrfs
        1       Predict
        1       all
        24      assemble_transcripts_fq
        1       base_recalibration
        1       blastp
        1       call_gvcf_varaints
        1       call_vcf_variants
        1       cdna_alignment_orf_to_genome_orf
        1       copy_gff3_to_snpeff
        1       custom_protein_xml
        1       final_vcf_naming
        1       finish_isoform
        1       finish_isoform_variants
        1       finish_variants
        1       generate_snpeff_database
        1       gtf_file_to_cDNA_seqs
        1       gtf_to_alignment_gff3
        24      hisat2_align_bam_fq
        1       hisat2_groupmark_bam
        1       hisat2_merge_bams
        1       merge_transcripts
        1       reference_protein_xml
        1       remove_exon_and_utr_information
        1       split_n_cigar_reads
        1       transfer_modifications_isoformvariant
        1       transfer_modifications_variant
        1       variant_annotation_custom
        1       variant_annotation_ref
        75
ProtectedOutputException in line 202 of /home/animeshs/Spritz/Spritz/rules/align.smk:
Write-protected output files for rule hisat2_align_bam_fq:
/home/animeshs/rnAGS/align/22308_ACTGAT_C5E7AANXX_7_20141008B_20141008.bam..fq.sorted.bam
  File "/home/animeshs/miniconda3/envs/spritz/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 131, in run_jobs
  File "/home/animeshs/miniconda3/envs/spritz/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 433, in run
  File "/home/animeshs/miniconda3/envs/spritz/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 225, in _run
  File "/home/animeshs/miniconda3/envs/spritz/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 150, in _run
acesnik commented 3 years ago

I've never seen this before. Are the other snakemake processes still hanging around for some reason?

acesnik commented 3 years ago

ps aux | grep snakemake

animesh commented 3 years ago

There are a couple of other processes, but I think these are unrelated?

(spritz) animeshs@DMED7596:~/Spritz/Spritz$ ps aux | grep snakemake
animeshs  1959  0.0  0.0   4636     0 pts/1    S+   Mar29   0:00 /bin/sh -c snakemake --snakefile /home/animeshs/miniconda3/envs/atlas/lib/python3.6/site-packages/atlas/Snakefile --directory /mnt/z/ayu  --rerun-incomplete --configfile '/mnt/z/ayu/config.yaml' --nolock   --use-conda --conda-prefix /mnt/z/ayu/databases/conda_envs   all
animeshs  1960  3.8  0.0 1895004    4 pts/1    Sl+  Mar29 491:32 /home/animeshs/miniconda3/envs/atlas/bin/python3.6 /home/animeshs/miniconda3/envs/atlas/bin/snakemake --snakefile /home/animeshs/miniconda3/envs/atlas/lib/python3.6/site-packages/atlas/Snakefile --directory /mnt/z/ayu --rerun-incomplete --configfile /mnt/z/ayu/config.yaml --nolock --use-conda --conda-prefix /mnt/z/ayu/databases/conda_envs all
animeshs 24722  0.0  0.0  14872  2252 pts/2    S+   09:24   0:00 grep --color=auto snakemake
acesnik commented 3 years ago

Hmm, yeah that looks like it's from a pipeline named atlas. If you named the spritz environment atlas instead, you might be able to cancel those.

animesh commented 3 years ago

so you think conda envs are cross-contaminating?

acesnik commented 3 years ago

No, I don't think they're colliding. I really don't know why those output files are write protected. That's pretty mysterious! I do think restarting after the other runs finish is a good idea.

animesh commented 3 years ago

Yes that is the plan, will get back once through 👍🏼

animesh commented 3 years ago

I eventually ended up deleting the whole /home/animeshs/rnAGS/align/ directory and restarting, but then it fails with Error in rule reference_protein_xml (log.2.txt). Digging further into Homo_sapiens.GRCh38.97.spritz.log shows dotnet's System.NullReferenceException. Any ideas what might be a way forward? Do I need to install mono in WSL?

acesnik commented 3 years ago

Are you using version 0.2.4? That looks familiar from errors we were getting in v0.2.3.

acesnik commented 3 years ago

mono won't help here, as it only works with .NET Framework applications, and this application targets .NET Core.

animesh commented 3 years ago

Does just running conda update --all bring Spritz to the latest? What's the best way to check the version? The git log shows:

commit d48529ec60331be1875ce8376bbeb8cc9b426b9b (HEAD -> master, origin/master, origin/HEAD)
Author: Anthony <cesnik@wisc.edu>
Date:   Thu Mar 18 13:05:41 2021 -0500

    Use sra-tools 2.10.1 for prefetch/fastq-dump (#209)

    * add openssh

    * use 2.10.1 for sra-tools
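As a quick sketch for checking the checked-out commit (assuming the clone lives at ~/Spritz; the spritzversion field in config.yaml also records the release):

```shell
# Print the most recent commit of the Spritz clone, if it exists:
if [ -d "$HOME/Spritz/.git" ]; then
  git -C "$HOME/Spritz" log -1 --oneline
fi
```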
acesnik commented 3 years ago

That is the latest commit. Thanks for checking!

I'll look into this more later today.

acesnik commented 3 years ago

Sorry for the delay on this. I took a look but didn't get very far. I'm traveling now, so I'll be able to take a closer look in a couple weeks.

acesnik commented 3 years ago

Hi @animesh, I was able to replicate the error in reference_protein_xml, and I believe it is now fixed in this PR https://github.com/smith-chem-wisc/Spritz/pull/211. You could try re-running this pipeline with the current versions of Spritz.

animesh commented 3 years ago

Thanks @acesnik for getting back 👍🏽 Somehow I managed to lose that old WSL session, so I had to redo the setup, but the data and partial results are the same, except for the config:

cd /home/animeshs/Spritz/Spritz/workflow
(spritzbase) animeshs@DMED7596:~/Spritz/Spritz/workflow$ cat config/config.yaml
sra: [] #  paired-end SRAs, comma separated, can leave empty, e.g. SRR629563
fq: [22286_CGATGT_C5E7AANXX_5_20141008B_20141008.bam., 22287_TGACCA_C5E7AANXX_5_20141008B_20141008.bam., 22288_ACAGTG_C5E7AANXX_5_20141008B_20141008.bam., 22289_GCCAAT_C5E7AANXX_5_20141008B_20141008.bam., 22290_CAGATC_C5E7AANXX_5_20141008B_20141008.bam., 22291_CTTGTA_C5E7AANXX_5_20141008B_20141008.bam., 22292_AGTCAA_C5E7AANXX_5_20141008B_20141008.bam., 22293_AGTTCC_C5E7AANXX_5_20141008B_20141008.bam., 22294_ATGTCA_C5E7AANXX_6_20141008B_20141008.bam., 22295_CCGTCC_C5E7AANXX_6_20141008B_20141008.bam., 22296_GTCCGC_C5E7AANXX_6_20141008B_20141008.bam., 22297_GTGAAA_C5E7AANXX_6_20141008B_20141008.bam., 22298_ATCACG_C5E7AANXX_6_20141008B_20141008.bam., 22299_TTAGGC_C5E7AANXX_6_20141008B_20141008.bam., 22300_ACTTGA_C5E7AANXX_6_20141008B_20141008.bam., 22301_GATCAG_C5E7AANXX_6_20141008B_20141008.bam., 22302_TAGCTT_C5E7AANXX_7_20141008B_20141008.bam., 22303_GGCTAC_C5E7AANXX_7_20141008B_20141008.bam., 22304_GTGGCC_C5E7AANXX_7_20141008B_20141008.bam., 22305_GTTTCG_C5E7AANXX_7_20141008B_20141008.bam., 22306_CGTACG_C5E7AANXX_7_20141008B_20141008.bam., 22307_GAGTGG_C5E7AANXX_7_20141008B_20141008.bam., 22308_ACTGAT_C5E7AANXX_7_20141008B_20141008.bam., 22309_ATTCCT_C5E7AANXX_7_20141008B_20141008.bam.] # paired-end fastq prefixes, comma separated, can leave empty, e.g. TestPairedEnd
sra_se: [] # single-end SRAs, comma separated, can leave empty, e.g. SRR8070095
fq_se: [] # single-end fastq prefixes, comma separated, can leave empty, e.g. TestSingleEnd
analysis_directory: [/home/animeshs/rnAGS] # for paths to drive e.g. /mnt/c/AnalysisFolder
species: "Homo_sapiens" # ensembl species name
genome: "GRCh38" # ensembl genome version
release: "100" # ensembl release version
organism: "human" # based on uniprot
# analyses: [isoform,variant,quant] # isoform construction, variant calling, transcript quantification
analyses: [variant, isoform]
spritz_version: "0.3.3" # should be the same here, common.smk, and MainWindow.xml.cs
prebuilt_spritz_mods: False
(spritzbase) animeshs@DMED7596:~/Spritz/Spritz/workflow$

where release: "100" is bumped up, hope that is fine? If so, then when I run

`snakemake -j 12 --keep-going --resources mem_mb=64000`

I get the following error in ../resources/SpritzModifications.build.log:

Could not execute because the application was not found or a compatible .NET SDK is not installed.
Possible reasons for this include:
  * You intended to execute a .NET program:
      The application 'restore' does not exist.
  * You intended to execute a .NET SDK command:
      It was not possible to find any installed .NET SDKs.
      Install a .NET SDK from:
        https://aka.ms/dotnet-download

Do I need to install the .NET SDK?

acesnik commented 3 years ago

Hi @animesh, could you please use the command snakemake -j 12 --keep-going --resources mem_mb=64000 --use-conda --conda-frontend mamba?

Those last two options are new with the recent overhaul to improve the setup speed, and they should get it to work.

animesh commented 3 years ago

Looks like it went past that issue, but it does say Job failed, going on with independent jobs? I was not sure which log is relevant, so I have tar-gzipped them (log.zip); let me know if that works? Could it be because of bumping the genome release to 100 from the 97/98 I used earlier when re-running the pipeline?

acesnik commented 3 years ago

Interesting. It looks like some of the transcript assemblies are failing to build.

Could you send me the log files at /home/animeshs/rnAGS/*/*.log? For example, it looks like /home/animeshs/rnAGS/isoforms/22289_GCCAAT_C5E7AANXX_5_20141008B_20141008.bam..fq.sorted.gtf.log is one for a failed run.

acesnik commented 3 years ago

I don't think the issue is from bumping the genome version. I've seen these issues before when the aligned read counts are low for some reason.

acesnik commented 3 years ago

You could also look at some of the log files from aligning these files to see if that's the case, e.g. if there are low read counts. In a typical experiment, I would expect >80% or >90% of reads to be aligned. I've seen these types of errors when there are <20% aligned, for example, which points to an alignment issue.
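To eyeball this, here is a small sketch that pulls the summary line out of each align log. report_alignment_rates is a hypothetical helper; it assumes HISAT2's standard "NN.NN% overall alignment rate" line appears in each log:

```shell
# Sketch: report the HISAT2 overall alignment rate per sample log.
# Low percentages (e.g. <20%) point to an alignment problem.
report_alignment_rates() {
  dir="$1"
  for f in "$dir"/*.log; do
    [ -e "$f" ] || continue   # skip when no logs match
    rate=$(grep -o '[0-9][0-9.]*% overall alignment rate' "$f" | cut -d'%' -f1)
    printf '%s: %s%%\n' "$(basename "$f")" "${rate:-?}"
  done
}

# e.g.: report_alignment_rates /home/animeshs/rnAGS/align
```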

acesnik commented 3 years ago

If spritz cannot detect isoforms with stringtie, I'd recommend just finishing the pipeline with the variant analysis. That is, you can try removing isoform from the config file's requested analyses to see if it finishes the job.

animesh commented 3 years ago

OK, I have disabled the isoform call and re-run the workflow

(spritzbase) animeshs@DMED7596:~/Spritz/Spritz/workflow$ vim config/config.yaml
(spritzbase) animeshs@DMED7596:~/Spritz/Spritz/workflow$ snakemake -j 12 --keep-going --resources mem_mb=64000 --use-conda --conda-frontend mamba
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 12
Rules claiming more threads will be scaled down.
Provided resources: mem_mb=64000
Job counts:
        count   jobs
        1       all
        1       base_recalibration
        1       call_gvcf_varaints
        1       call_vcf_variants
        1       final_vcf_naming
        1       finish_variants
        1       reference_protein_xml
        1       split_n_cigar_reads
        1       transfer_modifications_variant
        1       variant_annotation_ref
        10

[Mon Aug  9 13:32:16 2021]
rule split_n_cigar_reads:
    input: /home/animeshs/rnAGS/variants/combined.sorted.grouped.marked.bam, ../resources/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.karyotypic.fa, ../resources/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.karyotypic.fa.fai, ../resources/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.karyotypic.dict, ../resources/tmp
    output: /home/animeshs/rnAGS/variants/combined.fixedQuals.bam, /home/animeshs/rnAGS/variants/combined.sorted.grouped.marked.split.bam, /home/animeshs/rnAGS/variants/combined.sorted.grouped.marked.split.bam.bai
    log: /home/animeshs/rnAGS/variants/combined.sorted.grouped.marked.split.log
    jobid: 20
    benchmark: /home/animeshs/rnAGS/variants/combined.sorted.grouped.marked.split.benchmark
    wildcards: dir=/home/animeshs/rnAGS
    resources: mem_mb=24000

[Mon Aug  9 13:32:16 2021]
rule reference_protein_xml:
    input: ptmlist.txt, PSI-MOD.obo.xml, ../resources/SnpEff/data/Homo_sapiens.GRCh38/doneHomo_sapiens.GRCh38.txt, ../resources/SnpEff/snpEff.jar, ../resources/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.karyotypic.fa, ../SpritzModifications/bin/x64/Release/net5.0/SpritzModifications.dll, ../resources/uniprot/Homo_sapiens.protein.xml.gz
    output: /home/animeshs/rnAGS/variants/doneHomo_sapiens.GRCh38.100.txt, /home/animeshs/rnAGS/variants/Homo_sapiens.GRCh38.100.protein.xml, /home/animeshs/rnAGS/variants/Homo_sapiens.GRCh38.100.protein.xml.gz, /home/animeshs/rnAGS/variants/Homo_sapiens.GRCh38.100.protein.fasta, /home/animeshs/rnAGS/variants/Homo_sapiens.GRCh38.100.protein.withdecoys.fasta, /home/animeshs/rnAGS/variants/Homo_sapiens.GRCh38.100.protein.withmods.xml, /home/animeshs/rnAGS/variants/Homo_sapiens.GRCh38.100.protein.withmods.xml.gz
    log: /home/animeshs/rnAGS/variants/Homo_sapiens.GRCh38.100.spritz.log
    jobid: 2
    benchmark: /home/animeshs/rnAGS/variants/Homo_sapiens.GRCh38.100.spritz.benchmark
    wildcards: dir=/home/animeshs/rnAGS
    resources: mem_mb=16000

Activating conda environment: /home/animeshs/Spritz/Spritz/workflow/.snakemake/conda/8d931524

but it crashed the WSL itself... BTW, cat /home/animeshs/rnAGS/isoforms/*bam..fq.sorted.gtf.log returned empty; I guess it is because of the rerunning? What is the most optimal way to check for mapped read counts?

acesnik commented 3 years ago

Thanks for giving that a try!

Could you tell me more about the WSL crash? If I remember correctly, you have 64 GB of RAM, so that's probably not the issue...

Thanks for the information about the *.sorted.gtf.log files being empty.

Please check on the files at /home/animeshs/rnAGS/align/*.hisat2.log to see information about the alignment rates.
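A quick way to summarize those rates across logs is to grep for the "overall alignment rate" line hisat2 prints at the end of each log. The sketch below runs against a mock log in a throwaway directory so it is safe to execute anywhere; in practice, point LOGDIR at the real align/ directory from this thread (/home/animeshs/rnAGS/align).

```shell
# Summarize the "overall alignment rate" line from each hisat2 log.
# Demo uses a mock log; substitute the real align/ directory for LOGDIR.
LOGDIR=$(mktemp -d)
echo "94.88% overall alignment rate" > "$LOGDIR/sample.hisat2.log"
for f in "$LOGDIR"/*.hisat2.log; do
  rate=$(grep "overall alignment rate" "$f" | awk '{print $1}')
  echo "$(basename "$f"): $rate"
done
# → sample.hisat2.log: 94.88%
```

Anything well below the >80-90% range mentioned above would point to an alignment problem.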

animesh commented 3 years ago

Yes, I have 64GB RAM free, but it looks like the snakemake directory lock was the issue, as a rerun after invoking snakemake with --unlock, followed by snakemake -j 12 --keep-going --resources mem_mb=64000 --use-conda --conda-frontend mamba, is still going strong, keeping fingers crossed! BTW

(base) animeshs@DMED7596:~$ cat /home/animeshs/rnAGS/align/22289_GCCAAT_C5E7AANXX_5_20141008B_20141008.bam..fq.hisat2.log
34632639 reads; of these:
  34632639 (100.00%) were paired; of these:
    2271032 (6.56%) aligned concordantly 0 times
    29827102 (86.12%) aligned concordantly exactly 1 time
    2534505 (7.32%) aligned concordantly >1 times
    ----
    2271032 pairs aligned concordantly 0 times; of these:
      167458 (7.37%) aligned discordantly 1 time
    ----
    2103574 pairs aligned 0 times concordantly or discordantly; of these:
      4207148 mates make up the pairs; of these:
        3544017 (84.24%) aligned 0 times
        513094 (12.20%) aligned exactly 1 time
        150037 (3.57%) aligned >1 times
94.88% overall alignment rate
[bam_sort_core] merging from 23 files and 1 in-memory blocks...

Thanks to you @acesnik for making & sharing, may the force be with you 👍🏽

acesnik commented 3 years ago

Okay, great! No problem!

I'm going to close this issue. Feel free to reopen or open a new one if you run into anything new!

animesh commented 3 years ago

It went fine for a while but then crashed (2021-08-09T133906.922634.snakemake.log). Looks like the underlying issue is something to do with GATK/htsjdk.samtools.util.RuntimeIOException (combined.sorted.grouped.marked.split.log). Is more RAM needed?

acesnik commented 3 years ago

It looks like the drive where spritz is located may have run out of storage space. I think this happens when it's writing temporary files and the temporary file location (../resources/tmp) runs out of space.
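One quick way to check this is to report the free space on the filesystem holding the temp location. The sketch below defaults TMPDIR_PATH to /tmp so it runs anywhere; in this thread it would be ../resources/tmp relative to the workflow directory.

```shell
# Report free space on the filesystem holding the Spritz temp dir.
# TMPDIR_PATH is a stand-in default; use ../resources/tmp in practice.
TMPDIR_PATH="${TMPDIR_PATH:-/tmp}"
df -h "$TMPDIR_PATH"
# Extract the available space (in KB) from portable df output:
avail=$(df -Pk "$TMPDIR_PATH" | awk 'NR==2 {print $4}')
echo "available KB on temp filesystem: $avail"
```

If that number approaches zero during SplitNCigarReads, the temporary files are the likely culprit.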

Could you share /home/animeshs/rnAGS/variants/Homo_sapiens.GRCh38.100.spritz.log, as well?

animesh commented 3 years ago

I think drive space should not be the issue, as

(spritzbase) animeshs@DMED7596:~/Spritz/Spritz/workflow$ df -kh
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb        251G  216G   23G  91% /
none             43G  4.0K   43G   1% /mnt/wsl
tools           953G  746G  208G  79% /init
none             43G     0   43G   0% /dev
none             43G     0   43G   0% /run
none             43G     0   43G   0% /run/lock
none             43G  8.0K   43G   1% /run/shm
none             43G     0   43G   0% /run/user
tmpfs            43G     0   43G   0% /sys/fs/cgroup
drivers         953G  746G  208G  79% /usr/lib/wsl/drivers
lib             953G  746G  208G  79% /usr/lib/wsl/lib
drvfs           953G  746G  208G  79% /mnt/c
drvfs           3.7T  3.4T  263G  93% /mnt/f
drvfs           9.1T  4.5T  4.7T  49% /mnt/z

/mnt/z has the data... Looking into the log file you asked for, Homo_sapiens.GRCh38.100.spritz.log seems to me to be complaining about a DLL?

acesnik commented 3 years ago

The latter issue in Homo_sapiens.GRCh38.100.spritz.log was fixed here: https://github.com/smith-chem-wisc/Spritz/pull/217 Could you try doing git fetch --all; git pull origin master to see if those changes help?

acesnik commented 3 years ago

Is your /home directory with spritz also on /mnt/z? If not, I'd recommend running spritz from /mnt/z, as well.

With it there, where there's plenty of space, I really don't know why a file is closing in the middle of the SplitNCigarReads tool execution. Could you please check whether temporary files are being saved to ../resources/tmp on /mnt/z/ within the Spritz folder?

I just double checked the option for specifying the temporary directory (--tmp-dir), and it looks like that is still correct, so I don't think that's the issue.

animesh commented 3 years ago

OK, I have pulled and moved Spritz and am re-running the pipeline

(base) animeshs@DMED7596:~/Spritz$ git fetch --all
Fetching origin
(base) animeshs@DMED7596:~/Spritz$ git pull origin master
From https://github.com/animesh/Spritz
 * branch            master     -> FETCH_HEAD
(base) animeshs@DMED7596:~/Spritz$ cd ..
(base) animeshs@DMED7596:~$ mv Spritz /mnt/z/.
(base) animeshs@DMED7596:~$ cd /mnt/z/Spritz/Spritz/workflow
(base) animeshs@DMED7596:/mnt/z/Spritz/Spritz/workflow$ conda activate spritzbase
(spritzbase) animeshs@DMED7596:/mnt/z/Spritz/Spritz/workflow$ snakemake -j 12 --keep-going --resources mem_mb=64000 --use-conda --conda-frontend mamba
Building DAG of jobs...
Creating conda environment envs/proteogenomics.yaml...
Downloading and installing remote packages.
...

but after this it keeps crashing at the first step? Should I just redo it, or is there something else I can look into to save the work done so far?

acesnik commented 3 years ago

I'm not sure what's going wrong based on that output. Is it producing any log files? I also wonder whether the analysis directory is specified as an absolute path in the config file.

animesh commented 3 years ago

Looks like it is getting stuck at the conda activation stage. Below is the one I re-ran and just cancelled; it had been running since yesterday...

(spritzbase) animeshs@DMED7596:/mnt/z/Spritz/Spritz/workflow$ snakemake -j 8 --unlock --keep-going --resources mem_mb=32000 --use-conda --conda-frontend mamba
Unlocking working directory.
(spritzbase) animeshs@DMED7596:/mnt/z/Spritz/Spritz/workflow$ snakemake -j 8 --keep-going --resources mem_mb=32000 --use-conda --conda-frontend mamba
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 8
Rules claiming more threads will be scaled down.
Provided resources: mem_mb=32000
Job counts:
        count   jobs
        1       all
        1       base_recalibration
        1       call_gvcf_varaints
        1       call_vcf_variants
        1       final_vcf_naming
        1       finish_variants
        1       split_n_cigar_reads
        1       transfer_modifications_variant
        1       variant_annotation_ref
        9

[Thu Aug 12 14:36:34 2021]
rule split_n_cigar_reads:
    input: /home/animeshs/rnAGS/variants/combined.sorted.grouped.marked.bam, ../resources/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.karyotypic.fa, ../resources/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.karyotypic.fa.fai, ../resources/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.karyotypic.dict, ../resources/tmp
    output: /home/animeshs/rnAGS/variants/combined.fixedQuals.bam, /home/animeshs/rnAGS/variants/combined.sorted.grouped.marked.split.bam, /home/animeshs/rnAGS/variants/combined.sorted.grouped.marked.split.bam.bai
    log: /home/animeshs/rnAGS/variants/combined.sorted.grouped.marked.split.log
    jobid: 20
    benchmark: /home/animeshs/rnAGS/variants/combined.sorted.grouped.marked.split.benchmark
    wildcards: dir=/home/animeshs/rnAGS
    resources: mem_mb=24000

Activating conda environment: /mnt/z/Spritz/Spritz/workflow/.snakemake/conda/3ddf2249
^CTerminating processes on user request, this might take some time.
^CCancelling snakemake on user request.
[Fri Aug 13 09:33:02 2021]
Error in rule split_n_cigar_reads:
    jobid: 20
    output: /home/animeshs/rnAGS/variants/combined.fixedQuals.bam, /home/animeshs/rnAGS/variants/combined.sorted.grouped.marked.split.bam, /home/animeshs/rnAGS/variants/combined.sorted.grouped.marked.split.bam.bai
    log: /home/animeshs/rnAGS/variants/combined.sorted.grouped.marked.split.log (check log file(s) for error message)
    conda-env: /mnt/z/Spritz/Spritz/workflow/.snakemake/conda/3ddf2249
    shell:
        (gatk --java-options "-Xmx24000M -Dsamjdk.compression_level=9" FixMisencodedBaseQualityReads -I /home/animeshs/rnAGS/variants/combined.sorted.grouped.marked.bam -O /home/animeshs/rnAGS/variants/combined.fixedQuals.bam && gatk --java-options "-Xmx24000M -Dsamjdk.compression_level=9" SplitNCigarReads -R ../resources/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.karyotypic.fa -I /home/animeshs/rnAGS/variants/combined.fixedQuals.bam -O /home/animeshs/rnAGS/variants/combined.sorted.grouped.marked.split.bam --tmp-dir ../resources/tmp || gatk --java-options "-Xmx24000M -Dsamjdk.compression_level=9" SplitNCigarReads -R ../resources/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.karyotypic.fa -I /home/animeshs/rnAGS/variants/combined.sorted.grouped.marked.bam -O /home/animeshs/rnAGS/variants/combined.sorted.grouped.marked.split.bam --tmp-dir ../resources/tmp; samtools index /home/animeshs/rnAGS/variants/combined.sorted.grouped.marked.split.bam) &> /home/animeshs/rnAGS/variants/combined.sorted.grouped.marked.split.log
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Removing output files of failed job split_n_cigar_reads since they might be corrupted:
/home/animeshs/rnAGS/variants/combined.fixedQuals.bam, /home/animeshs/rnAGS/variants/combined.sorted.grouped.marked.split.bam
Job failed, going on with independent jobs.

Below is the config; it points to an absolute path, should it be relative?

(spritzbase) animeshs@DMED7596:/mnt/z/Spritz/Spritz/workflow$ less config/config.yaml
sra: [] #  paired-end SRAs, comma separated, can leave empty, e.g. SRR629563
fq: [22286_CGATGT_C5E7AANXX_5_20141008B_20141008.bam., 22287_TGACCA_C5E7AANXX_5_20141008B_20141008.bam., 22288_ACAGTG_C5E7AANXX_5_20141008B_20141008.bam., 22289_GCCAAT_C5E7AANXX_5_20141008B_20141008.bam., 22290_CAGATC_C5E7AANXX_5_20141008B_20141008.bam., 22291_CTTGTA_C5E7AANXX_5_20141008B_20141008.bam., 22292_AGTCAA_C5E7AANXX_5_20141008B_20141008.bam., 22293_AGTTCC_C5E7AANXX_5_20141008B_20141008.bam., 22294_ATGTCA_C5E7AANXX_6_20141008B_20141008.bam., 22295_CCGTCC_C5E7AANXX_6_20141008B_20141008.bam., 22296_GTCCGC_C5E7AANXX_6_20141008B_20141008.bam., 22297_GTGAAA_C5E7AANXX_6_20141008B_20141008.bam., 22298_ATCACG_C5E7AANXX_6_20141008B_20141008.bam., 22299_TTAGGC_C5E7AANXX_6_20141008B_20141008.bam., 22300_ACTTGA_C5E7AANXX_6_20141008B_20141008.bam., 22301_GATCAG_C5E7AANXX_6_20141008B_20141008.bam., 22302_TAGCTT_C5E7AANXX_7_20141008B_20141008.bam., 22303_GGCTAC_C5E7AANXX_7_20141008B_20141008.bam., 22304_GTGGCC_C5E7AANXX_7_20141008B_20141008.bam., 22305_GTTTCG_C5E7AANXX_7_20141008B_20141008.bam., 22306_CGTACG_C5E7AANXX_7_20141008B_20141008.bam., 22307_GAGTGG_C5E7AANXX_7_20141008B_20141008.bam., 22308_ACTGAT_C5E7AANXX_7_20141008B_20141008.bam., 22309_ATTCCT_C5E7AANXX_7_20141008B_20141008.bam.] # paired-end fastq prefixes, comma separated, can leave empty, e.g. TestPairedEnd
sra_se: [] # single-end SRAs, comma separated, can leave empty, e.g. SRR8070095
fq_se: [] # single-end fastq prefixes, comma separated, can leave empty, e.g. TestSingleEnd
analysis_directory: [/home/animeshs/rnAGS] # for paths to drive e.g. /mnt/c/AnalysisFolder
species: "Homo_sapiens" # ensembl species name
genome: "GRCh38" # ensembl genome version
release: "100" # ensembl release version
organism: "human" # based on uniprot
# analyses: [isoform,variant,quant] # isoform construction, variant calling, transcript quantification
analyses: [variant]
spritz_version: "0.3.4" # should be the same here, common.smk, and MainWindow.xml.cs
prebuilt_spritz_mods: False

acesnik commented 3 years ago

You could try deleting the /mnt/z/Spritz/Spritz/workflow/.snakemake/ folder that has the environments and try again. It should just rebuild the environments quickly before running.
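For reference, that cleanup is just a recursive remove of the .snakemake/ folder, which also drops any stale locks. The sketch below demonstrates it against a throwaway directory so it is safe to run; in the thread it would be done from /mnt/z/Spritz/Spritz/workflow before relaunching snakemake.

```shell
# Deleting .snakemake/ forces Snakemake to rebuild its cached conda
# environments (and clears stale directory locks) on the next run.
# Demo against a throwaway directory; substitute the real workflow dir.
demo=$(mktemp -d)
mkdir -p "$demo/.snakemake/conda"
rm -rf "$demo/.snakemake"
ls -A "$demo"   # prints nothing: the cached environments are gone
```

After removal, rerunning the usual snakemake command with --use-conda recreates the environments before the first rule executes.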