hwanglab / divine

Divine: Prioritizing Genes for Rare Mendelian Disease in Whole Exome Sequencing Data

configuration file requirements #6

Open hjafar opened 4 years ago

hjafar commented 4 years ago

I have installed the Divine tool (https://github.com/hwanglab/divine/blob/master/documents/tutorial/divine_tutorial.md), but when I run the following command, I get the error shown below.

$ divine.py -v ./Pfeiffer.vcf -o ./Pfeiffer.noHpo

IOError: check [beta_fit = /home/genomic-lab/Documents/kfmc/tools/divine/gcndata/snv_training/clin_tr.dill] in the file [/home/genomic-lab/Documents/kfmc/tools/divine/gcn/config/divine.conf]

The error appears to be related to the configuration file, which requires the following files in its [database] section:

[database]
ext_disease_to_gene = gcndata/hpo/ALL_SOURCES_disgnet_TYPICAL_FEATURES_diseases_to_genes_to_phenotypes.txt
beta_fit = gcndata/snv_training/clin_tr.dill
esp_to_gene = gcndata/stringDB/est2geneSymbol_20160306.tsv

What should I do to fix the issue?

Best regards, Hussain

cjhong commented 4 years ago

Can you make sure whether the file (/home/genomic-lab/Documents/kfmc/tools/divine/gcndata/snv_training/clin_tr.dill) exists?
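The same existence check can be run for every [database] entry at once rather than one file at a time. The sketch below is a minimal, unofficial helper and not part of Divine. It assumes that divine.conf parses as a standard INI file and that relative paths resolve against the Divine install directory (as the error message above suggests). The function name missing_database_files is hypothetical.

```python
import configparser
import os


def missing_database_files(conf_path, install_dir):
    """Return (key, resolved_path) pairs from [database] whose files do not exist.

    Assumption: relative paths in the config are relative to install_dir.
    """
    # interpolation=None keeps any '%' characters in paths literal;
    # strict=False tolerates duplicate keys instead of raising.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read(conf_path)

    missing = []
    if not parser.has_section("database"):
        return missing

    for key, value in parser.items("database"):
        path = value.strip()
        if not os.path.isabs(path):
            path = os.path.join(install_dir, path)
        if not os.path.exists(path):
            missing.append((key, path))
    return missing
```

Called as missing_database_files("gcn/config/divine.conf", "/path/to/divine"), it returns an empty list when every referenced file is present.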

hjafar commented 4 years ago

Thanks for the fast reply. The file does not exist in the divine folder (snv_training). The other files do not exist either. How can I get them?

hjafar commented 4 years ago

I am still waiting for a response from you regarding the database files listed above. Thank you.

cjhong commented 4 years ago

Sorry for the late reply. Have you followed the installation instructions? You need to access AWS files to install the resource files.

On Tue, Jan 28, 2020 at 5:20 AM hjafar notifications@github.com wrote:

I am still waiting for a response from you regarding the database files listed above. Thank you.


hjafar commented 4 years ago

I really appreciate your reply. I tried to find another tool similar to yours, but I could not find one that answers the biological question with WES data for complex inheritance diseases. I am still having difficulty finding and installing all the files listed in the config file. I could not find the "AWS" files needed to install the resource files. Note: the tool itself is installed correctly (divine.py --help runs), but some files referenced in divine.conf are missing (i.e., in [database]: ext_disease_to_gene, beta_fit, and esp_to_gene) when I run divine.py -v ./Pfeiffer.vcf -o ./Pfeiffer.noHpo. Thanks.

cjhong commented 4 years ago

Can you download the file at the link and try it?

https://drive.google.com/open?id=1_n9UFzKesRqyJqN5IVVO1LQ4HDaxl-Ad

Thank you,

hjafar commented 4 years ago

Good day. Thank you for the quick response and for sending the requested files.

I ran the following command but ran into the issue shown below. Could you please have a look and tell me how to solve it?

$ divine.py -v '/home/genomic-lab/Documents/kfmc/tools/divine/gcn/bin/prioritize/examples/Pfeiffer.vcf' -o ./Pfeiffer.noHpo

############################################
2020-03-29 22:52:31.222536 [PART:] Divine (v0.1.2) is running on [HPO:None ,VCF:/home/genomic-lab/Documents/kfmc/tools/divine/gcn/bin/prioritize/examples/Pfeiffer.vcf]

############################################
2020-03-29 22:52:31.222572 [INFO:Divine] initializing Divine ...

2020-03-29 22:52:31.222595 [INFO:_set_args] storing input condition ...

2020-03-29 22:52:31.222614 [INFO:_set_args] prepared log directory[./Pfeiffer.noHpo/logs] ...

2020-03-29 22:52:31.222798 [INFO:] reading configuration file [_read_config;/home/genomic-lab/Documents/kfmc/tools/divine/gcn/config/divine.conf] ...

2020-03-29 22:52:31.223093 [INFO:] done. [_read_config]

2020-03-29 22:52:31.223143 [INFO:] capturing user command line [record_commandline] ...

2020-03-29 22:52:31.223315 [INFO:] done. [record_commandline]

2020-03-29 22:52:31.223345 [INFO:] initialization completed [_set_args]

opening VCF [/home/genomic-lab/Documents/kfmc/tools/divine/gcn/bin/prioritize/examples/Pfeiffer.vcf] and parse heads ...
predicting gender from the sample [/home/genomic-lab/Documents/kfmc/tools/divine/gcn/bin/prioritize/examples/Pfeiffer.vcf]
gender identified [1], Done.
Sample ID [manuel] is identified for a proband analysis!
setting up damaging factors ...
2020-03-29 22:52:31.458824 [INFO:] VCF is going to be masked by RefGene coding region

2020-03-29 22:52:31.458900 [INFO:] done. initialization

2020-03-29 22:52:31.458922 [INFO:] analyzing variants on [/home/genomic-lab/Documents/kfmc/tools/divine/gcn/bin/prioritize/examples/Pfeiffer.vcf] ...

2020-03-29 22:52:31.458959 [INFO:] annotating VCF file[vannotate;/home/genomic-lab/Documents/kfmc/tools/divine/gcn/bin/prioritize/examples/Pfeiffer.vcf] ...

2020-03-29 22:52:31.459043 [INFO:] creating a bed file[./Pfeiffer.noHpo/refGene_e20_so_merged.bed] containing RefGene coding region (cmpl/incmpl/unk) @ RefGeneUcscTB.create_bed

2020-03-29 22:52:32.302593 [INFO:] sorting bed file ... @ RefGeneUcscTB.create_bed

2020-03-29 22:52:33.115626 [INFO:] merging exon coordinates overlapped each other... @ RefGeneUcscTB.create_bed

2020-03-29 22:52:34.198943 [INFO:] done. @ RefGeneUcscTB.create_bed

2020-03-29 22:52:34.202829 [INFO:] extracting variants in coding region from [/home/genomic-lab/Documents/kfmc/tools/divine/gcn/bin/prioritize/examples/Pfeiffer.vcf] @ vannotate ...

2020-03-29 22:52:34.203170 [INFO:] storing coding region boundaries from [./Pfeiffer.noHpo/refGene_e20_so_merged.bed] @ RefGeneUcscTB.get_boundary

2020-03-29 22:52:37.378357 [INFO:] done. @ RefGeneUcscTB.get_boundary

2020-03-29 22:52:37.378608 [INFO:] masking the vcf file [/home/genomic-lab/Documents/kfmc/tools/divine/gcn/bin/prioritize/examples/Pfeiffer.vcf] by the bed file [./Pfeiffer.noHpo/refGene_e20_so_merged.bed] @ BedMaskingVCF.run

2020-03-29 22:52:38.193988 [INFO:] done. @ BedMaskingVCF.run

2020-03-29 22:52:38.194143 [INFO:] done.@ vannotate

2020-03-29 22:52:38.194266 [INFO:] python /home/genomic-lab/Documents/kfmc/tools/divine/gcn/bin/annotpipe.py -i ./Pfeiffer.noHpo/refgene_e20.vcf -o ./Pfeiffer.noHpo/divine.vcf -l ./Pfeiffer.noHpo/logs

retcode: 1
Traceback (most recent call last):
  File "/home/genomic-lab/Documents/kfmc/tools/divine/gcn/bin/prioritize/divine.py", line 1415, in <module>
    main()
  File "/home/genomic-lab/Documents/kfmc/tools/divine/gcn/bin/prioritize/divine.py", line 1369, in main
    dv.vannotate(args.reuse)
  File "/home/genomic-lab/Documents/kfmc/tools/divine/gcn/bin/prioritize/divine.py", line 481, in vannotate
    lib_utils.runcmd2(cmd,self.log_dir,self.logger,job_name)
  File "/home/genomic-lab/Documents/kfmc/tools/divine/gcn/lib/utils/lib_utils.py", line 208, in runcmd2
    raise RuntimeError('[%s] failed' % cmd_str)
RuntimeError: [python /home/genomic-lab/Documents/kfmc/tools/divine/gcn/bin/annotpipe.py -i ./Pfeiffer.noHpo/refgene_e20.vcf -o ./Pfeiffer.noHpo/divine.vcf -l ./Pfeiffer.noHpo/logs] failed


When I ran the annotpipe.py script directly, I got the following message.

$ '/home/genomic-lab/Documents/kfmc/tools/divine/gcn/bin/annotpipe.py' --help
/home/genomic-lab/Documents/kfmc/tools/divine/gcn/bin/annotpipe.py: line 11: $'\n.. module:: annotpipe\n\t:platform: Unix, Windows, MacOSX\n\t:synopsis: A wraper to call VARANT\n\n.. moduleauthor:: Kunal Kundu (kunal.kundu@tcs.com); modified by changjin.hong@gmail.com\n\nThis modules is a wrapper to call VARANT. The inputs are -\n1. Unannotated VCF file path\n2. Path to create annotated vcf file (Option)\n': command not found

I look forward to hearing from you soon.

Thank you so much in advance...

cjhong commented 4 years ago

Run this and send me the log files in "./Pfeiffer.noHpo/logs".

python /home/genomic-lab/Documents/kfmc/tools/divine/gcn/bin/annotpipe.py -i ./Pfeiffer.noHpo/refgene_e20.vcf -o ./Pfeiffer.noHpo/divine.vcf -l ./Pfeiffer.noHpo/logs

hjafar commented 4 years ago

Here is the output after running the above command:

$ python /home/genomic-lab/Documents/kfmc/tools/divine/gcn/bin/annotpipe.py -i '/home/genomic-lab/Documents/kfmc/tools/divine/Pfeiffer.noHpo/refgene_e20.vcf' -o ./Pfeiffer.noHpo/divine.vcf -l ./Pfeiffer.noHpo/logs
Annotation on [/home/genomic-lab/Documents/kfmc/tools/divine/Pfeiffer.noHpo/refgene_e20.vcf] in progress. Be patient (30 min+) ...
Traceback (most recent call last):
  File "/home/genomic-lab/Documents/kfmc/tools/divine/gcn/bin/annotpipe.py", line 118, in <module>
    main()
  File "/home/genomic-lab/Documents/kfmc/tools/divine/gcn/bin/annotpipe.py", line 115, in main
    ap.annotate_varant()
  File "/home/genomic-lab/Documents/kfmc/tools/divine/gcn/bin/annotpipe.py", line 83, in annotate_varant
    annotator.main(self.invcf, self.outvcf, self.logger, options)
  File "/home/genomic-lab/Documents/kfmc/tools/divine/gcn/lib/varann/vartype/varant/annotator.py", line 1369, in main
    snpa = SNPAnnotation(options)
  File "/home/genomic-lab/Documents/kfmc/tools/divine/gcn/lib/varann/vartype/varant/annotator.py", line 85, in __init__
    self.omim = self.mimdb.load_omim()
  File "/home/genomic-lab/Documents/kfmc/tools/divine/gcn/lib/databases/omim.py", line 44, in load_omim
    results = self.execute(stmt).fetchall()
  File "/home/genomic-lab/Documents/kfmc/tools/divine/gcn/lib/io/db.py", line 243, in execute
    return cursor.execute(stmt)
sqlite3.OperationalError: no such table: mimdis
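The "no such table: mimdis" error can be narrowed down by asking SQLite directly whether the table is present in the database file. The sketch below is a generic check, not Divine's own code. The path of the OMIM database file is not given in this thread, so db_path is a placeholder that the reader has to supply. Note that sqlite3.connect can create an empty file when the path does not exist, so the helper tests for the file first.

```python
import os
import sqlite3


def has_table(db_path, table_name):
    """Return True if the SQLite file at db_path exists and contains table_name."""
    # Avoid silently creating an empty database when the file is missing.
    if not os.path.isfile(db_path):
        return False

    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master is SQLite's built-in catalog of schema objects.
        row = conn.execute(
            "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
            (table_name,),
        ).fetchone()
        return row is not None
    finally:
        conn.close()
```

A False result for has_table(db_path, "mimdis") would mean that the file is missing or that the table was never loaded into it. Either case points to an incomplete database setup rather than a bug in the annotation step.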


The following is the content of the log file in ./Pfeiffer.noHpo/logs. I have also attached the file.

2020-03-30 15:36:08,684 - gcn - INFO - Environment variable check successful..
2020-03-30 15:36:08,684 - gcn - INFO - Input file = /home/genomic-lab/Documents/kfmc/tools/divine/Pfeiffer.noHpo/refgene_e20.vcf, Output file = ./Pfeiffer.noHpo/divine.vcf

varant_20200330_1536.log

cjhong commented 4 years ago

Can you make sure that the databases are all set up properly?

  1. My setup for gcndb looks like this:

     ??@??:divine$ ls gcndb
     clinvardb  clnphesnpdb  dbsnp  exac    interpro      mimdb  nsfpdb  refmrna  splicedb
     clinvitae  cosmic       esp    geneontology  kgdb   mirna   refgene  regulomedb  utrdb

  2. Or, did you add all of the shell environment configuration after installation?

  3. Shall we communicate directly via email (changjin.hong@gmail.com)?

  4. The other option is to use Exomiser, which is also available for WES germline annotation.
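The gcndb listing in the first item above can be turned into a quick comparison against a local install. The sketch below is only an illustration. The entry names are copied from that listing, and it is an assumption that every installation needs all of them. The function name missing_gcndb_entries is hypothetical and is not part of Divine.

```python
import os

# Entries copied from the `ls gcndb` output in this thread. Treating this
# as the complete required set is an assumption.
EXPECTED_GCNDB_ENTRIES = [
    "clinvardb", "clnphesnpdb", "dbsnp", "exac", "interpro", "mimdb",
    "nsfpdb", "refmrna", "splicedb", "clinvitae", "cosmic", "esp",
    "geneontology", "kgdb", "mirna", "refgene", "regulomedb", "utrdb",
]


def missing_gcndb_entries(gcndb_path, expected=EXPECTED_GCNDB_ENTRIES):
    """Return the sorted names from `expected` that are absent under gcndb_path."""
    return sorted(
        name
        for name in expected
        if not os.path.exists(os.path.join(gcndb_path, name))
    )
```

If mimdb appears in the result of missing_gcndb_entries("/path/to/divine/gcndb"), that would be consistent with the missing mimdis table reported earlier in the thread.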

hjafar commented 4 years ago

Thank you so much. Please check your email.