hwanglab / divine

Divine: Prioritizing Genes for Rare Mendelian Disease in Whole Exome Sequencing Data
Issues running tutorial examples #2

Closed holtjma closed 6 years ago

holtjma commented 6 years ago

Hello,

I was testing the tutorial datasets (to verify that my installation works). The first one (HPO only) worked fine, but the second one (VCF only, Pfeiffer) failed with the following output:

root@a13a700f319d:/divine/gcn/bin/prioritize/examples# divine.py -v ./Pfeiffer.vcf -o ./Pfeiffer_noHpo
############################################
2018-08-31 20:36:14.181851 [PART:<module>] Divine (v0.1.2) is running on [HPO:None ,VCF:./Pfeiffer.vcf]

############################################
2018-08-31 20:36:14.181916 [INFO:Divine] initializing Divine ...

2018-08-31 20:36:14.181991 [INFO:_set_args] storing input condition ...

2018-08-31 20:36:14.182265 [INFO:_set_args] prepared log directory[./Pfeiffer_noHpo/logs]  ...

2018-08-31 20:36:14.182914 [INFO:<module>] reading configuration file [_read_config;/divine/gcn/config/divine.conf] ...

2018-08-31 20:36:14.184094 [INFO:<module>] done. [_read_config]

2018-08-31 20:36:14.184240 [INFO:<module>] capturing user command line [record_commandline] ...

2018-08-31 20:36:14.184837 [INFO:<module>] done. [record_commandline]

2018-08-31 20:36:14.184940 [INFO:<module>] <divine> initialization completed [_set_args]

opening VCF [./Pfeiffer.vcf] and parse heads ...
predicting gender from the sample [./Pfeiffer.vcf]
gender identified [1], Done.
Sample ID [manuel] is identified for a proband analysis!
setting up damaging factors ...
2018-08-31 20:36:14.521744 [INFO:<module>] VCF is going to be masked by RefGene coding region

2018-08-31 20:36:14.521898 [INFO:<module>] done. initialization

2018-08-31 20:36:14.522384 [INFO:<module>] analyzing variants on [./Pfeiffer.vcf] ...

2018-08-31 20:36:14.522525 [INFO:<module>] annotating VCF file[vannotate;./Pfeiffer.vcf] ...

Traceback (most recent call last):
  File "/divine/gcn/bin/prioritize/divine.py", line 1414, in <module>
    main()
  File "/divine/gcn/bin/prioritize/divine.py", line 1368, in main
    dv.vannotate(args.reuse)
  File "/divine/gcn/bin/prioritize/divine.py", line 449, in vannotate
    cRef = annotateRegion.RefGeneUcscTB(work_dir=self.out_dir,logger=self.logger)
  File "/divine/gcn/lib/varann/vartype/varant/annotateRegion.py", line 25, in __init__
    raise IOError('refseq file [%s] not exist!'%self.refGene_fn)
IOError: refseq file [/divine/gcndata/refgene/refseq.txt] not exist!

I installed this using the Docker's python:2 image following the exact commands provided in the tutorial. Is there an additional step not listed to acquire the refseq.txt file? I found no mention of it in the tutorial.
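One way to narrow this kind of failure down before running an example is to check that the expected data files are actually on disk. The following is a minimal sketch, not part of Divine: the helper names are hypothetical, and the only paths it knows about are the ones quoted in this thread (`/divine/gcndata/refgene/refseq.txt` from the traceback, and the `gcndata/hpo` directory reported empty later on).

```python
import os


def is_populated(path):
    """True if path is a non-empty file or a directory with at least one entry."""
    if os.path.isdir(path):
        return len(os.listdir(path)) > 0
    return os.path.isfile(path) and os.path.getsize(path) > 0


def missing_data_paths(data_root, relative_paths):
    """Return the relative paths that are absent or empty under data_root."""
    return [rel for rel in relative_paths
            if not is_populated(os.path.join(data_root, rel))]


if __name__ == "__main__":
    # Assumption: the data root and entries below are taken from this thread
    # only; a real installation may need more files than these two.
    data_root = "/divine/gcndata"
    expected = ["refgene/refseq.txt", "hpo"]
    for rel in missing_data_paths(data_root, expected):
        print("missing or empty: %s" % os.path.join(data_root, rel))
```

If this prints anything, the problem is in the data download step rather than in the example itself.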

cjhong commented 6 years ago

I forgot to include some data files in the release that are necessary to run Divine. The svn update as of 9/12/2018 resolves this issue.

For those of you who installed Divine previously, try this:

1) update your local copy with the latest commit
2) cd $DIVINE
3) python ./setup.py --install --update_db
4) cd gcn/bin/prioritize/examples
5) ./runme_pfeisffer_noHpo.sh

holtjma commented 6 years ago

I did a fresh install (again from Docker's python:2 image) and now neither the Angelman nor the Pfeiffer example works. I'm not sure why, but it is a different error this time.

EDIT: I checked the gcndata/hpo path and nothing is there. On an old (non-Docker) install, that directory appears to contain five files that this fresh install did not download. Did something get removed from the install script that downloaded that data?

Here are the exact commands I ran; I didn't notice any errors during the installation or downloads:

[masked]$ docker run -it --rm python:2 bash
root@f6ab9cf08507:/# git clone https://github.com/hwanglab/divine.git
root@f6ab9cf08507:/# cd divine
root@f6ab9cf08507:/divine# python ./setup.py --install --update_db
root@f6ab9cf08507:/divine# export DIVINE=/divine
root@f6ab9cf08507:/divine# export PATH=$HOME/.local/bin:$PATH
root@f6ab9cf08507:/divine# export PATH=$DIVINE/gcn/bin/prioritize:$PATH
root@f6ab9cf08507:/divine# export PYTHONPATH=$DIVINE
root@f6ab9cf08507:/divine# export PYTHONPATH=$DIVINE/python_libs/lib/python2.7/site-packages:$PYTHONPATH
root@f6ab9cf08507:/divine# 
root@f6ab9cf08507:/divine# cd gcn/bin/prioritize/examples

Angelman's output:

root@f6ab9cf08507:/divine/gcn/bin/prioritize/examples# ./runme_angelman.sh 
../divine.py -q ./Angelman_Syndrome.hpo -o ./Angelman_Syndrome
############################################
2018-09-12 14:14:46.671276 [PART:<module>] Divine (v0.1.2) is running on [HPO:./Angelman_Syndrome.hpo ,VCF:None]

############################################
2018-09-12 14:14:46.671348 [INFO:Divine] initializing Divine ...

2018-09-12 14:14:46.671389 [INFO:_set_args] storing input condition ...

2018-09-12 14:14:46.671754 [INFO:_set_args] prepared log directory[./Angelman_Syndrome/logs]  ...

2018-09-12 14:14:46.672228 [INFO:<module>] reading configuration file [_read_config;/divine/gcn/config/divine.conf] ...

Traceback (most recent call last):
  File "../divine.py", line 1416, in <module>
    main()
  File "../divine.py", line 1344, in main
    dv = Divine(args)
  File "../divine.py", line 77, in __init__
    self._set_args(uargs)
  File "../divine.py", line 213, in _set_args
    self._read_config(uargs.vcf_filter_cfg)
  File "../divine.py", line 267, in _read_config
    self._set_config('database', 'ext_disease_to_gene')
  File "../divine.py", line 241, in _set_config
    raise IOError('check if [%s] exists in %s[%s]' % (entry, self.config_fn, section))
IOError: check if [ext_disease_to_gene] exists in /divine/gcn/config/divine.conf[database]

Pfeiffer output:

root@f6ab9cf08507:/divine/gcn/bin/prioritize/examples# ./runme_pfeisffer_noHpo.sh 
../divine.py -v ./Pfeiffer.vcf -o ./Pfeiffer_noHpo -e 1 -c ../../../config/filterconf_dp10.txt --reuse -k 0
############################################
2018-09-12 14:14:15.608239 [PART:<module>] Divine (v0.1.2) is running on [HPO:None ,VCF:./Pfeiffer.vcf]

############################################
2018-09-12 14:14:15.608340 [INFO:Divine] initializing Divine ...

2018-09-12 14:14:15.608410 [INFO:_set_args] storing input condition ...

2018-09-12 14:14:15.608495 [INFO:_set_args] prepared log directory[./Pfeiffer_noHpo/logs]  ...

2018-09-12 14:14:15.608942 [INFO:<module>] reading configuration file [_read_config;/divine/gcn/config/divine.conf] ...

Traceback (most recent call last):
  File "../divine.py", line 1416, in <module>
    main()
  File "../divine.py", line 1344, in main
    dv = Divine(args)
  File "../divine.py", line 77, in __init__
    self._set_args(uargs)
  File "../divine.py", line 213, in _set_args
    self._read_config(uargs.vcf_filter_cfg)
  File "../divine.py", line 267, in _read_config
    self._set_config('database', 'ext_disease_to_gene')
  File "../divine.py", line 241, in _set_config
    raise IOError('check if [%s] exists in %s[%s]' % (entry, self.config_fn, section))
IOError: check if [ext_disease_to_gene] exists in /divine/gcn/config/divine.conf[database]
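Both tracebacks stop at the same configuration entry, so it can help to inspect that entry directly. The sketch below is a hypothetical diagnostic, not Divine's own `_set_config`: it assumes `divine.conf` is an INI-style file that Python's `configparser` can read, and that the value of `ext_disease_to_gene` is a file path (optionally relative to a base directory). Neither assumption is confirmed in this thread.

```python
import configparser
import os


def check_config_entry(config_path, section, entry, base_dir=""):
    """Return None if the entry exists and points to an existing path,
    otherwise a short description of what is wrong."""
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(config_path):
        return "config file not readable: %s" % config_path
    if not parser.has_option(section, entry):
        return "no entry '%s' in section [%s]" % (entry, section)
    # Assumption: the value is a path; an absolute value ignores base_dir.
    target = os.path.join(base_dir, parser.get(section, entry))
    if not os.path.exists(target):
        return "entry '%s' points to a missing path: %s" % (entry, target)
    return None


if __name__ == "__main__":
    # Path, section, and entry names are copied from the error message above.
    problem = check_config_entry("/divine/gcn/config/divine.conf",
                                 "database", "ext_disease_to_gene")
    print(problem or "entry looks fine")
```

This separates "the entry is missing from the config" from "the entry is present but the file it names was never downloaded", which the single IOError message does not distinguish.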

cjhong commented 6 years ago

Can you try this?

cd $DIVINE
python ./setup.py --install

holtjma commented 6 years ago

Ah, I didn't realize I needed to run setup.py without --update_db for the first install. That worked; both examples now run to completion!
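For anyone scripting the installation, the outcome of this thread can be written down as a small rule. This is only a sketch of what was reported here: the function name is hypothetical, and the two flags are the ones quoted above (`--install` for a fresh install, `--install --update_db` for an existing one).

```python
def setup_command(fresh_install):
    """Build the setup.py invocation suggested in this thread.

    fresh_install=True  -> python ./setup.py --install
    fresh_install=False -> python ./setup.py --install --update_db
    """
    cmd = ["python", "./setup.py", "--install"]
    if not fresh_install:
        # Per the thread, --update_db is for refreshing an existing install.
        cmd.append("--update_db")
    return cmd


if __name__ == "__main__":
    # Run the printed command from the $DIVINE directory.
    print(" ".join(setup_command(fresh_install=True)))
```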