smith-chem-wisc / Spritz

Software for RNA-Seq analysis to create sample-specific proteoform databases from RNA-Seq data
https://smith-chem-wisc.github.io/Spritz/
MIT License

Could not find file common_all_20170710GRCh38.vcf #158

Closed animesh closed 4 years ago

animesh commented 4 years ago

Was checking out

animeshs@DMED7596:/mnt/c/Users/animeshs/Desktop/Documents/Spritz0.0.19$ ./CMD.exe -c proteins

but facing the issue:

Unhandled Exception: System.IO.FileNotFoundException: Could not find file 'C:\Users\animeshs\Desktop\Documents\Spritz0.0.19\common_all_20170710GRCh38.vcf'.
   at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
   at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy, Boolean useLongPath, Boolean checkHost)
   at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options, String msgPath, Boolean bFromProxy, Boolean useLongPath, Boolean checkHost)
   at System.IO.StreamReader..ctor(String path, Encoding encoding, Boolean detectEncodingFromByteOrderMarks, Int32 bufferSize, Boolean checkHost)
   at System.IO.StreamReader..ctor(String path)
   at ToolWrapperLayer.GATKWrapper.ConvertVCFChromosomesUCSC2Ensembl(String spritzDirectory, String vcfPath, String reference, Boolean dryRun) in E:\source\repos\Spritz2\ToolWrapperLayer\GATKWrapper.cs:line 222
   at ToolWrapperLayer.GATKWrapper.DownloadEnsemblKnownVariantSites(String spritzDirectory, Boolean commonOnly, String reference, Boolean dryRun) in E:\source\repos\Spritz2\ToolWrapperLayer\GATKWrapper.cs:line 203
   at CMD.Spritz.Main(String[] args) in E:\source\repos\Spritz2\CMD\Spritz.cs:line 60

I do see it invoking another WSL window [image], which seems to download ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b150_GRCh38p7/VCF/GATK/common_all_20170710.vcf.gz, so perhaps that is where the issue is?

animesh commented 4 years ago

update:

  1. downloading common_all_20170710.vcf.gz: wget https://ftp.ncbi.nih.gov/snp/organisms/human_9606_b150_GRCh38p7/VCF/GATK/common_all_20170710.vcf.gz
  2. running gunzip and renaming the result to common_all_20170710GRCh38.vcf
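
The two manual steps above can be sketched in Python. This is only a sketch under stated assumptions, not something Spritz ships: `gunzip_to` and `fetch_vcf` are hypothetical helper names, and the URL and target filename are simply the ones quoted in this comment.

```python
import gzip
import shutil
import urllib.request
from pathlib import Path

# URL and target filename are copied from the comment above.
VCF_URL = ("https://ftp.ncbi.nih.gov/snp/organisms/human_9606_b150_GRCh38p7/"
           "VCF/GATK/common_all_20170710.vcf.gz")
TARGET_NAME = "common_all_20170710GRCh38.vcf"


def gunzip_to(gz_path: Path, out_path: Path) -> Path:
    """Decompress gz_path into out_path, streaming so the file need not fit in memory."""
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out_path


def fetch_vcf(spritz_dir: Path, url: str = VCF_URL) -> Path:
    """Download the gzipped VCF into spritz_dir and unpack it as TARGET_NAME."""
    gz_path = spritz_dir / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, gz_path)
    return gunzip_to(gz_path, spritz_dir / TARGET_NAME)
```

Calling `fetch_vcf(Path("."))` from the Spritz folder would leave the renamed VCF next to `CMD.exe`, which is where the exception above was looking for it.
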
animesh commented 4 years ago

The GATK popup seems to have died, though, with the following message in the parent terminal:

C:\Users\animeshs\Desktop\Documents\Spritz0.0.19>CMD.exe -c proteins
Note: You can specify a UniProt XML file with the -x flag to transfer modificaitons and database references.

Unhandled Exception: System.ArgumentException: Error: experiment type was not recognized.
   at CMD.Spritz.Main(String[] args) in E:\source\repos\Spritz2\CMD\Spritz.cs:line 88

I guess it needs some files that were not downloaded during the main setup, and I failed to notice?

acesnik commented 4 years ago

Thanks for the bug report. We'll look into it. There are a lot of changes coming down the pipe, so hopefully this will be solved.

acesnik commented 4 years ago

Hi there, we rewrote the whole program, so feel free to give it another try and post any issues you encounter.

We've also released a preprint for Spritz, which you can find here: https://www.biorxiv.org/content/10.1101/2020.06.08.140681v1.

Thanks!

animesh commented 4 years ago

Just gave it a shot with some paired-end data [image], and although it says done:

Command executing: Powershell.exe docker pull smithlab/spritz ; docker run --rm -t --name spritz -v """F:\docker:/app/analysis""" -v """F:\docker\data:/app/data""" -v """L:\promec\Animesh\Spritz_0_1_2\configs:/app/configs""" smithlab/spritz > """F:\docker\workflow_2020-06-11-13-39-52.txt"""; docker stop spritz
Saving output to F:\docker\workflow_2020-06-11-13-39-52.txt. Please monitor it there...

: Done!

but F:\docker\data seems to be empty, any ideas why?

acesnik commented 4 years ago

Thanks for trying it out!

Docker would need to have access to the F:\ and L:\ drives for that run. Could you please check that they are listed in Settings -> Resources -> File Sharing within Docker Desktop?

Which organism and Ensembl version are you using? We're debugging some of those combinations; the most tried and tested right now is homo_sapiens with version 82.

acesnik commented 4 years ago

I found and fixed a small issue with the wildcards for using fastq inputs (https://github.com/smith-chem-wisc/Spritz/commit/b5cf04683509e3148f0ab2bc993a2be0d30b0eff), and I'm having success when mimicking your filenames.

image

Could you give this run another try? The new docker image should download automatically using the Spritz GUI.

animesh commented 4 years ago

Thanks for the suggestion :) At least F:\docker\workflow_2020-06-16-10-02-33.txt is being written now, but its content complains about missing input files:

wildcard constraints in inputs are ignored
wildcard constraints in inputs are ignored
Building DAG of jobs...
MissingInputException in line 7 of /app/rules/qc.smk:
Missing input files for rule expand_fastqs:
analysis/combined_1.fastq.gz
analysis/combined_2.fastq.gz
(spritz) root@24a822c6f087:/app#

Regards,

Ani


acesnik commented 4 years ago

I was able to replicate the error. Working on it...

acesnik commented 4 years ago

I fixed the error you were experiencing in this pull request: https://github.com/smith-chem-wisc/Spritz/pull/176. It's ready for another try. Thanks for your patience, @animesh!

acesnik commented 4 years ago

The updated Docker image should download automatically when you start the run in Spritz.

animesh commented 4 years ago

Thanks again for looking into this for me :) There are still some issues, though: "CalledProcessError in line 14 of /app/rules/downloads.smk" (details in the attached log). Any ideas?


wildcard constraints in inputs are ignored
wildcard constraints in inputs are ignored
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 24
Rules claiming more threads will be scaled down.
Unlimited resources: mem_mb
Job counts:
    count   jobs
    1       all
    1       assemble_transcripts
    1       base_recalibration
    1       build_gtf_sharp
    1       build_transfer_mods
    1       call_gvcf_varaints
    1       call_vcf_variants
    1       custom_protein_xml
    1       dict_fa
    1       download_adapters
    1       download_ensembl_references
    1       download_protein_xml
    1       download_snpeff
    8       fastqc_analysis
    1       filter_transcripts_add_cds
    1       final_vcf_naming
    1       generate_snpeff_database
    8       hisat2_align_bam
    1       hisat2_groupmark_bam
    1       hisat2_merge_bams
    1       hisat2_splice_sites
    1       hisat_genome
    1       index_ensembl_vcf
    1       index_fa
    1       reference_protein_xml
    1       reorder_genome_fasta
    8       skewer
    1       snpeff_database_setup
    1       split_n_cigar_reads
    1       tmpdir
    1       transfer_modifications_isoformvariant
    1       transfer_modifications_variant
    1       variant_annotation_custom
    1       variant_annotation_ref
    55

[Fri Jun 19 08:55:11 2020]
rule download_protein_xml:
    output: data/uniprot/Homo_sapiens.protein.xml.gz
    jobid: 8

[Fri Jun 19 08:55:11 2020]
rule download_ensembl_references:
    output: data/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.fa, data/ensembl/Homo_sapiens.GRCh38.82.gff3, data/ensembl/Homo_sapiens.GRCh38.pep.all.fa, data/ensembl/Homo_sapiens.vcf.gz, data/ensembl/Homo_sapiens.ensembl.vcf
    log: data/ensembl/downloads.log
    jobid: 15

[Fri Jun 19 08:55:11 2020]
rule tmpdir:
    output: tmp, temporary
    jobid: 6

[Fri Jun 19 08:55:11 2020]
rule build_gtf_sharp:
    output: GtfSharp/GtfSharp/bin/Release/netcoreapp2.1/GtfSharp.dll
    log: data/GtfSharp.build.log
    jobid: 18

[Fri Jun 19 08:55:11 2020]
rule build_transfer_mods:
    output: TransferUniProtModifications/TransferUniProtModifications/bin/Release/netcoreapp2.1/TransferUniProtModifications.dll
    log: data/TransferUniProtModifications.build.log
    jobid: 7

[Fri Jun 19 08:55:11 2020]
Finished job 6.
1 of 55 steps (2%) done

[Fri Jun 19 08:55:11 2020]
rule download_adapters:
    output: BBMap, data/qc/adapters.fa
    jobid: 54

[Fri Jun 19 08:55:11 2020]
rule download_snpeff:
    output: SnpEff/snpEff.config, SnpEff/snpEff.jar
    log: data/SnpEffInstall.log
    jobid: 9

Cloning into 'BBMap'...
remote: Enumerating objects: 632, done.
remote: Counting objects: 100% (632/632), done.
remote: Compressing objects: 100% (606/606), done.
remote: Total 632 (delta 120), reused 212 (delta 24), pack-reused 0
Receiving objects: 100% (632/632), 2.07 MiB | 4.92 MiB/s, done.
Resolving deltas: 100% (120/120), done.
Removing temporary output file BBMap.
[Fri Jun 19 08:55:13 2020]
Finished job 54.
2 of 55 steps (4%) done

[Fri Jun 19 08:55:13 2020]
rule skewer:
    input: analysis/Aas-gDNA1-S2-PaE_S2_L001_1.fastq, analysis/Aas-gDNA1-S2-PaE_S2_L001_2.fastq, data/qc/adapters.fa
    output: analysis/trimmed/Aas-gDNA1-S2-PaE_S2_L001.trim_1.fastq.gz, analysis/trimmed/Aas-gDNA1-S2-PaE_S2_L001.trim_2.fastq.gz
    log: analysis/trimmed/Aas-gDNA1-S2-PaE_S2_L001-trimmed.status
    jobid: 44
    wildcards: dir=analysis, fq=Aas-gDNA1-S2-PaE_S2_L001
    threads: 12

[Fri Jun 19 08:55:51 2020]
Finished job 7.
3 of 55 steps (5%) done
[Fri Jun 19 08:55:55 2020]
Finished job 18.
4 of 55 steps (7%) done
[Fri Jun 19 08:56:14 2020]
Error in rule download_ensembl_references:
    jobid: 15
    output: data/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.fa, data/ensembl/Homo_sapiens.GRCh38.82.gff3, data/ensembl/Homo_sapiens.GRCh38.pep.all.fa, data/ensembl/Homo_sapiens.vcf.gz, data/ensembl/Homo_sapiens.ensembl.vcf
    log: data/ensembl/downloads.log

RuleException:
CalledProcessError in line 14 of /app/rules/downloads.smk:
Command ' set -euo pipefail; (python scripts/download_ensembl.py Homo_sapiens.GRCh38 && gunzip data/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz data/ensembl/Homo_sapiens.GRCh38.82.gff3.gz data/ensembl/Homo_sapiens.GRCh38.pep.all.fa.gz && zcat data/ensembl/Homo_sapiens.vcf.gz | python scripts/clean_vcf.py > data/ensembl/Homo_sapiens.ensembl.vcf) 2> data/ensembl/downloads.log ' returned non-zero exit status 1.
  File "/app/rules/downloads.smk", line 14, in __rule_download_ensembl_references
  File "/usr/local/envs/spritz/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Trying to restart job 15.

[Fri Jun 19 08:56:14 2020]
rule download_ensembl_references:
    output: data/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.fa, data/ensembl/Homo_sapiens.GRCh38.82.gff3, data/ensembl/Homo_sapiens.GRCh38.pep.all.fa, data/ensembl/Homo_sapiens.vcf.gz, data/ensembl/Homo_sapiens.ensembl.vcf
    log: data/ensembl/downloads.log
    jobid: 15

acesnik commented 4 years ago

I'll take a look at that this week. I wasn't able to replicate the error on my side. Could you share the data/ensembl/downloads.log file with me? I'm also curious if deleting the data folder and rerunning Spritz would fix the issue, since one of those downloads might have been interrupted before.

animesh commented 4 years ago

Attached is the log you asked for :) I tried deleting the data folder and re-running the GUI/workflow, and it is stuck with the following message:

[image: image.png]

Using default tag: latest
latest: Pulling from smithlab/spritz
Digest: sha256:ad7fa131134ebb5c0c9c60a3686b7621187781294299081bf28bbbfd297bf9f5
Status: Image is up to date for smithlab/spritz:latest
docker.io/smithlab/spritz:latest


acesnik commented 4 years ago

I can't see the attachment, unfortunately; would you mind dropping it into a GitHub issue comment? I think that works better for attachments...

The message you posted looks like the Powershell popup when it is working correctly, so that's a good sign. Did it look like it was working in the workflow output?

animesh commented 4 years ago

Looking at the contents of F:\docker\data\ensembl\downloads.log (pasted below), and using whatever little brain I've got, the issue seems to be the FTP port, as even IE is not able to reach it [image]. This seems to be blocked at the router level, which I have no control over. HTTP does seem to work [image], so could you change ftp:// to http://, or let me know where to change it and how to build the GUI binary, and then I can give it a shot?

--2020-06-22 07:50:29--  ftp://ftp.ensembl.org/pub/release-82//fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.toplevel.fa.gz
           => 'data/ensembl/Homo_sapiens.GRCh38.dna.toplevel.fa.gz'
Resolving ftp.ensembl.org (ftp.ensembl.org)... 193.62.197.76, 193.62.197.76, 193.62.197.76, ...
Connecting to ftp.ensembl.org (ftp.ensembl.org)|193.62.197.76|:21... failed: Connection refused.
Connecting to ftp.ensembl.org (ftp.ensembl.org)|193.62.197.76|:21... failed: Connection refused.
Connecting to ftp.ensembl.org (ftp.ensembl.org)|193.62.197.76|:21... failed: Connection refused.
Connecting to ftp.ensembl.org (ftp.ensembl.org)|193.62.197.76|:21... failed: Connection refused.
Traceback (most recent call last):
  File "scripts/download_ensembl.py", line 24, in <module>
    subprocess.check_call(["wget", "-P", "data/ensembl/", toplevel])
  File "/usr/local/envs/spritz/lib/python3.6/subprocess.py", line 311, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['wget', '-P', 'data/ensembl/', 'ftp://ftp.ensembl.org/pub/release-82//fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.toplevel.fa.gz']' returned non-zero exit status 4.
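
A minimal sketch of the workaround being asked for here, assuming the same path is also served over HTTP (which the browser test above suggests). This is not the change that was eventually made in Spritz: `ftp_to_http` and `download` are hypothetical helper names, and only the `wget -P data/ensembl/` call is taken from the traceback.

```python
import subprocess
from urllib.parse import urlsplit, urlunsplit


def ftp_to_http(url: str) -> str:
    """Swap an ftp:// scheme for http://; any other URL is returned unchanged."""
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        return url
    return urlunsplit(parts._replace(scheme="http"))


def download(url: str, dest_dir: str = "data/ensembl/") -> None:
    """Try wget on the URL as given; if that fails, retry once over HTTP."""
    try:
        subprocess.check_call(["wget", "-P", dest_dir, url])
    except subprocess.CalledProcessError:
        fallback = ftp_to_http(url)
        if fallback == url:
            raise  # nothing else to try
        subprocess.check_call(["wget", "-P", dest_dir, fallback])
```

Falling back rather than hard-coding one scheme keeps FTP working on networks where only port 21 is reachable, at the cost of one failed attempt where it is blocked.
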
acesnik commented 4 years ago

Oh, strange! Thank you for posting the error messages. Both FTP and HTTP work here, so I'll change the default to HTTP.

acesnik commented 4 years ago

It should be ready to try again (ftp:// was changed to http:// here: https://github.com/smith-chem-wisc/Spritz/pull/177). I'd recommend deleting the data folder again. Spritz should download the new Docker image like before.

animesh commented 4 years ago

Thanks for the quick fix! I am (re)running the workflow and it looks like the download is going just fine 👍 (downloads - Copy.txt). Is there a better way to check the current status of the pipeline? Currently I am just tracking the Task Manager:

image

acesnik commented 4 years ago

The way I watch the progress is to watch the workflow_*.txt file in Notepad++ (https://notepad-plus-plus.org/downloads/). That way, you can reload it when changes have been made, i.e. it has started a new step.
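
For anyone who prefers a script to reloading the file by hand, here is a rough sketch of the same idea. It assumes the workflow file contains Snakemake progress lines of the form "4 of 55 steps (7%) done", as seen in the log earlier in this thread; `latest_progress` and `watch` are hypothetical names, not part of Spritz.

```python
import re
import time
from pathlib import Path

# Matches Snakemake progress lines such as "4 of 55 steps (7%) done".
PROGRESS = re.compile(r"(\d+) of (\d+) steps \((\d+)%\) done")


def latest_progress(text: str):
    """Return (done, total) from the last progress line in text, or None."""
    matches = PROGRESS.findall(text)
    if not matches:
        return None
    done, total, _percent = matches[-1]
    return int(done), int(total)


def watch(log_path: Path, interval_s: float = 30.0) -> None:
    """Poll the workflow log and print the step count whenever it changes."""
    seen = None
    while True:
        current = latest_progress(log_path.read_text(errors="replace"))
        if current is not None and current != seen:
            print(f"{current[0]} of {current[1]} steps done")
            seen = current
            if current[0] == current[1]:
                return  # all steps reported as finished
        time.sleep(interval_s)
```

Something like `watch(Path(r"F:\docker\workflow_2020-06-26-11-17-15.txt"))` would then print a line each time a step finishes. It only reports progress, so a failed run still needs a look at the log itself.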

image

acesnik commented 4 years ago

The final database that can be used in MetaMorpheus to find variant peptides/proteoforms is combined.spritz.snpeff.protein.withmods.xml.gz.

animesh commented 4 years ago

It seems that combined.spritz.snpeff.protein.withmods.xml.gz is nowhere to be found in the docker folder. I should note that the Docker shell window still doesn't seem to be done [image], but I don't see anything changed in workflow_2020-06-26-11-17-15.txt since yesterday, so it is probably stuck? Any ideas on how to figure out the issue here?

acesnik commented 4 years ago

It looks like it's failing on the step following the downloads, so it may be that the genome file didn't download properly. Could you check the sha256sum of the genome fasta to see if that's the case?

This is what I see:

$ sha256sum Homo_sapiens.GRCh38.dna.primary_assembly.fa
78777b0886e8dfa5e14e4957fbbaa53736fcbaa5668d59e09b6b7945fca93d8c  Homo_sapiens.GRCh38.dna.primary_assembly.fa
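
On a machine without `sha256sum` (for example plain Windows without WSL), the same check can be sketched with Python's standard `hashlib`. The expected digest below is the one quoted above; the helper names `sha256_of` and `matches_expected` are only illustrative.

```python
import hashlib
from pathlib import Path

# Digest quoted above for Homo_sapiens.GRCh38.dna.primary_assembly.fa.
EXPECTED = "78777b0886e8dfa5e14e4957fbbaa53736fcbaa5668d59e09b6b7945fca93d8c"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in 1 MiB chunks so a multi-gigabyte FASTA is never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: Path, expected: str = EXPECTED) -> bool:
    """True when the file's SHA-256 digest equals the expected hex string."""
    return sha256_of(path) == expected.lower()
```

A mismatch would point to an interrupted or corrupted download; a match, as in the reply below, means the problem lies elsewhere.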

Regarding the powershell window looking stuck, I fixed that in a new release, which gets away from opening the Powershell window to hopefully give better feedback on what processes are running: https://github.com/smith-chem-wisc/Spritz/releases/tag/0.1.3.

If it still doesn't work with that new version, I'd be happy to have a video call sometime this week to try to figure it out. I'm in the CDT (UTC-5) time zone; feel free to suggest a time.

animesh commented 4 years ago

The sha256sum looks the same:

(base) animeshs@DMED7596:~$ sha256sum docker/data/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.fa
78777b0886e8dfa5e14e4957fbbaa53736fcbaa5668d59e09b6b7945fca93d8c  docker/data/ensembl/Homo_sapiens.GRCh38.dna.primary_assembly.fa

I tried 0.1.3, by the way; it seems to crash with no popups (log attached: workflow_2020-06-29-09-33-43.txt). Should I restart Docker or reboot Windows itself, since Windows works in mysterious ways :P

A video call sounds cool; I can make it on any weekday this week between 8am and 11am your time 👍

acesnik commented 4 years ago

Thank you for giving those two things a try. It's really strange that it's crashing without popups; I haven't seen that before, nor the error at the end of the log. I think a video call might be helpful. How about tomorrow at 9:30 AM CDT?

acesnik commented 4 years ago

I don't know if rebooting will be helpful, but it might be worth a try.

animesh commented 4 years ago

Yes i am planning on doing that once my running processes are over 👍 will try again and get back in case i fail as usual ...

acesnik commented 4 years ago

Sorry for the delay. I had some technical difficulties. Do you have an email where I can send the video chat link?

animesh commented 4 years ago

No worries but i will first retry running the system after a reboot, hopefully sometime this week and then we chat?

acesnik commented 4 years ago

Sure, sounds good!

acesnik commented 4 years ago

You may also want to try increasing the resources allocated within Docker Desktop. We've run into issues caused by that recently.

Memory should be above 16 GB, Disk image size should be above 80 GB, and I have Swap set to 2 GB.
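
A rough way to check the memory part of this from a script, assuming the `docker` CLI is on the PATH. `docker info --format "{{.MemTotal}}"` prints the memory visible to the Docker engine in bytes; that figure is usually a little below the value configured in Docker Desktop, so treat the comparison as approximate. Disk image size and swap are not covered here, and the function names are hypothetical.

```python
import subprocess

MIN_MEMORY_GB = 16  # threshold taken from the comment above


def docker_memory_bytes() -> int:
    """Ask the Docker engine how much memory it can see, in bytes."""
    out = subprocess.check_output(
        ["docker", "info", "--format", "{{.MemTotal}}"], text=True
    )
    return int(out.strip())


def memory_ok(mem_bytes: int, min_gb: float = MIN_MEMORY_GB) -> bool:
    """True when mem_bytes exceeds min_gb gibibytes."""
    return mem_bytes > min_gb * 1024 ** 3
```

`memory_ok(docker_memory_bytes())` returning False would be a hint to raise the limit under Settings -> Resources before rerunning.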

animesh commented 4 years ago

Thanks for the tip, it was quite low on resources considering your suggestions!

image

Will upgrade and check after the promised re-boot 👍

animesh commented 4 years ago

Looks like the reboot took it a few steps further 👍 [image] Some errors are mentioned in the attached workflow_2020-07-06-12-51-12.txt:

(base) animeshs@DMED7596:~$ grep -i "error" /mnt/f/docker/workflow_2020-07-06-12-51-12.txt
Error in rule download_snpeff:
CalledProcessError in line 11 of /app/rules/variants.smk:
Error in rule filter_transcripts_add_cds:
CalledProcessError in line 40 of /app/rules/isoforms.smk:
Error in rule filter_transcripts_add_cds:
CalledProcessError in line 40 of /app/rules/isoforms.smk:
Error in rule filter_transcripts_add_cds:
CalledProcessError in line 40 of /app/rules/isoforms.smk:

Exiting because a job execution failed. Look above for error message
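
The grep above can also be summarised per rule. This is a small sketch that assumes failures appear in the workflow file as "Error in rule <name>:", as they do in the output above; `failed_rules` is a hypothetical helper name.

```python
import re
from collections import Counter

# Snakemake reports a failed job as "Error in rule <rule_name>:".
RULE_ERROR = re.compile(r"Error in rule (\w+):")


def failed_rules(log_text: str) -> Counter:
    """Count how many times each rule is reported as failed in log_text."""
    return Counter(RULE_ERROR.findall(log_text))
```

Repeated entries for one rule usually mean Snakemake retried the job, which is why `filter_transcripts_add_cds` shows up three times here.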

Any ideas on how to proceed would be great :)

acesnik commented 4 years ago

Good to hear it made it further down the line! It looks like the isoform pipeline (which is in beta) didn't find anything, and the reconstructed gene model is empty, so the subsequent tools failed. In terms of troubleshooting, could you confirm that these are human samples? Do the BAM files look like they have data, i.e. they are not very small files?
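
One quick way to answer the BAM question is to list suspiciously small files. The sketch below is only illustrative: the 1 MB default threshold is an arbitrary assumption, not a Spritz setting, and `small_bams` is a hypothetical helper name.

```python
from pathlib import Path


def small_bams(directory: Path, min_bytes: int = 1_000_000):
    """Return sorted (name, size) pairs for *.bam files under directory below min_bytes."""
    return sorted(
        (path.name, path.stat().st_size)
        for path in directory.rglob("*.bam")
        if path.stat().st_size < min_bytes
    )
```

An empty result from `small_bams(Path(r"F:\docker"))` would suggest that alignment produced data; file size alone says nothing about whether the reads actually mapped to the human reference.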

Please restart the pipeline from where you left off. I just built a new version of the pipeline, which will download automatically and performs only the variant pipeline discussed in the paper. I'll be releasing a new version in the near future that handles this error more gracefully.

animesh commented 4 years ago

Thanks for taking time out to reply and yes you are right, these are not really human samples. I was expecting to see some human contamination in a variant form, so i guess there are none, which is actually a piece of good news for the project 👍