choderalab / ensembler

Automated omics-scale protein modeling and simulation setup.
http://ensembler.readthedocs.io/
GNU General Public License v2.0

ensembler installation #47

Closed mihirdate closed 9 years ago

mihirdate commented 9 years ago

Hi guys, I am trying to install Ensembler. We installed Anaconda as one of the modules on our system and then followed the Ensembler installation steps using conda config. Now, ensembler --help displays the Ensembler help menu. But when I tried running the Ensembler example as follows,

$ ensembler quickmodel --target_uniprot_entry_name EGFR_HUMAN --uniprot_domain_regex '^Protein kinase' --template_pdbids 4KB8 --no-loopmodel

Following is the output I get.

Traceback (most recent call last):
  File "/opt/az/local/anaconda/2.3.0/installdir/bin/ensembler", line 6, in <module>
    sys.exit(main())
  File "/opt/az/local/anaconda/2.3.0/installdir/lib/python2.7/site-packages/ensembler/cli.py", line 40, in main
    command.dispatch(args)
  File "/opt/az/local/anaconda/2.3.0/installdir/lib/python2.7/site-packages/ensembler/cli_commands/quickmodel.py", line 106, in dispatch
    QuickModel(targetid=args['--targetid'], templateids=templateids, target_uniprot_entry_name=args['--target_uniprot_entry_name'], uniprot_domain_regex=args['--uniprot_domain_regex'], pdbids=pdbids, chainids=chainids_dict, template_uniprot_query=args['--template_uniprot_query'], template_seqid_cutoff=template_seqid_cutoff, loopmodel=not args['--no-loopmodel'], package_for_fah=args['--package_for_fah'], nfahclones=nfahclones, structure_dirs=structure_paths)
  File "/opt/az/local/anaconda/2.3.0/installdir/lib/python2.7/site-packages/ensembler/tools/quick_model.py", line 58, in __init__
    gather_targets_obj = ensembler.initproject.GatherTargetsFromUniProt(uniprot_query_string, uniprot_domain_regex=self.uniprot_domain_regex)
  File "/opt/az/local/anaconda/2.3.0/installdir/lib/python2.7/site-packages/ensembler/initproject.py", line 181, in __init__
    super(GatherTargetsFromUniProt, self).__init__()
  File "/opt/az/local/anaconda/2.3.0/installdir/lib/python2.7/site-packages/ensembler/initproject.py", line 90, in __init__
    self.manual_overrides = ensembler.core.ManualOverrides()
  File "/opt/az/local/anaconda/2.3.0/installdir/lib/python2.7/site-packages/ensembler/core.py", line 257, in __init__
    self.target = TargetManualOverrides(manual_overrides_yaml)
  File "/opt/az/local/anaconda/2.3.0/installdir/lib/python2.7/site-packages/ensembler/core.py", line 275, in __init__
    target_dict = manual_overrides_yaml.get('target-selection')
AttributeError: 'NoneType' object has no attribute 'get'

Can anyone tell if this is installation issue or usage?

Best, Mihir

danielparton commented 9 years ago

Hi Mihir, this is a bug in Ensembler - sorry about that. I have reproduced the error and am working on a fix. I'll release an updated version of Ensembler once that is done.
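For readers who hit the same traceback: the last frame calls `.get('target-selection')` on a value that is `None`. One plausible cause, and this is an assumption inferred only from the traceback, is that the manual overrides YAML file was empty or absent, so the parsed result was `None` rather than a dictionary. The sketch below shows a generic defensive pattern for that situation. The helper name `get_section` is hypothetical, and this is not necessarily how the bug was fixed in Ensembler.

```python
def get_section(parsed_yaml, key):
    """Return the mapping stored under `key`, or {} if the file or section is empty.

    `parsed_yaml` stands for whatever a YAML loader returned; an empty
    document typically parses to None rather than to an empty dict.
    """
    if parsed_yaml is None:
        # Nothing was parsed at all: behave as if no overrides were given.
        return {}
    section = parsed_yaml.get(key)
    # A key that is present but has no value also parses to None.
    return section if section is not None else {}


# Hypothetical inputs illustrating the three cases.
print(get_section(None, 'target-selection'))
print(get_section({'target-selection': None}, 'target-selection'))
print(get_section({'target-selection': {'example': 1}}, 'target-selection'))
```

With a guard like this, code that reads optional settings keeps working when the overrides file has no content.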

mihirdate commented 9 years ago

Hi Dan, thanks for the reply. Let me know when it's fixed and I will try it again.

danielparton commented 9 years ago

Hi Mihir, the new Ensembler release is now available. Please run conda update ensembler and check that the text output indicates that version 1.0.5 has been installed. This should fix the problem you had encountered.
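If you would rather check the version in a script than read it off the conda output, a comparison along these lines may help. It is a minimal sketch: the function names are hypothetical, it assumes plain dotted numeric versions such as 1.0.5, and you supply the installed version string yourself (for example, copied from the output of `conda list ensembler`).

```python
def parse_version(text):
    """Turn a dotted version string such as '1.0.5' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_at_least(installed, required="1.0.5"):
    """Return True if `installed` is the required version or newer."""
    # Tuples compare element by element, so (1, 0, 10) > (1, 0, 5),
    # which a plain string comparison would get wrong.
    return parse_version(installed) >= parse_version(required)


print(is_at_least("1.0.4"))
print(is_at_least("1.0.5"))
print(is_at_least("1.0.10"))
```

Comparing integer tuples rather than raw strings avoids the pitfall where "1.0.10" sorts before "1.0.5".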

I have also slightly changed the usage example in the documentation (http://ensembler.readthedocs.org/en/latest/examples.html): one of the template_pdbids is different. This is just to reduce the number of models generated, as it is meant to be a very quick example. It should now produce two models, one for each template.

Thanks for reporting this bug! Let me know if there are further issues.

Best, Danny

mihirdate commented 9 years ago

Thanks Danny. I did update to the latest version. Again, the install is complete and it produces output on

$ ensembler --help

Even the tests work fine (they were failing yesterday):

$ nosetests ensembler -a unit

...............................

Ran 30 tests in 514.959s

OK

But on running the example, this is what I get:

$ ensembler quickmodel --target_uniprot_entry_name EGFR_HUMAN --uniprot_domain_regex '^Protein kinase' --template_pdbids 1M14,4AF3 --no-loopmodel
Querying UniProt web server...
Traceback (most recent call last):
  File "/opt/az/local/anaconda/2.3.0/installdir/bin/ensembler", line 6, in <module>
    sys.exit(main())
  File "/opt/az/local/anaconda/2.3.0/installdir/lib/python2.7/site-packages/ensembler/cli.py", line 40, in main
    command.dispatch(args)
  File "/opt/az/local/anaconda/2.3.0/installdir/lib/python2.7/site-packages/ensembler/cli_commands/quickmodel.py", line 106, in dispatch
    QuickModel(targetid=args['--targetid'], templateids=templateids, target_uniprot_entry_name=args['--target_uniprot_entry_name'], uniprot_domain_regex=args['--uniprot_domain_regex'], pdbids=pdbids, chainids=chainids_dict, template_uniprot_query=args['--template_uniprot_query'], template_seqid_cutoff=template_seqid_cutoff, loopmodel=not args['--no-loopmodel'], package_for_fah=args['--package_for_fah'], nfahclones=nfahclones, structure_dirs=structure_paths)
  File "/opt/az/local/anaconda/2.3.0/installdir/lib/python2.7/site-packages/ensembler/tools/quick_model.py", line 64, in __init__
    uniprot_query_string, uniprot_domain_regex=self.uniprot_domain_regex
  File "/opt/az/local/anaconda/2.3.0/installdir/lib/python2.7/site-packages/ensembler/initproject.py", line 186, in __init__
    self._gather_targets()
  File "/opt/az/local/anaconda/2.3.0/installdir/lib/python2.7/site-packages/ensembler/utils.py", line 37, in print_done
    fn(*args, **kwargs)
  File "/opt/az/local/anaconda/2.3.0/installdir/lib/python2.7/site-packages/ensembler/initproject.py", line 196, in _gather_targets
    self.uniprotxml = ensembler.uniprot.get_uniprot_xml(self.uniprot_query_string, **get_uniprot_xml_args)
  File "/opt/az/local/anaconda/2.3.0/installdir/lib/python2.7/site-packages/ensembler/uniprot.py", line 35, in get_uniprot_xml
    uniprotxmlstring = query_uniprot(uniprot_query_string)
  File "/opt/az/local/anaconda/2.3.0/installdir/lib/python2.7/site-packages/ensembler/uniprot.py", line 23, in query_uniprot
    response = urlopen(query_url)
  File "/opt/az/local/anaconda/2.3.0/installdir/lib/python2.7/urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "/opt/az/local/anaconda/2.3.0/installdir/lib/python2.7/urllib2.py", line 431, in open
    response = self._open(req, data)
  File "/opt/az/local/anaconda/2.3.0/installdir/lib/python2.7/urllib2.py", line 449, in _open
    '_open', req)
  File "/opt/az/local/anaconda/2.3.0/installdir/lib/python2.7/urllib2.py", line 409, in _call_chain
    result = func(*args)
  File "/opt/az/local/anaconda/2.3.0/installdir/lib/python2.7/urllib2.py", line 1227, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/opt/az/local/anaconda/2.3.0/installdir/lib/python2.7/urllib2.py", line 1197, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error [Errno 111] Connection refused>

Thanks for your help.

jchodera commented 9 years ago

This error looks like it can't get to the RCSB or UniProt via HTTP.

Do you guys have some sort of proxy you need to use to access standard websites like the RCSB or UniProt?

Have you tried this from, say, the MSKCC hal cluster?
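One way to narrow this down is to test outbound HTTP from the same machine, with and without a proxy. The snippet below is a rough sketch under a few assumptions: it uses Python 3's `urllib.request` (the traceback above comes from Python 2.7's `urllib2`, which offers similar handlers), the helper names `proxy_settings` and `can_reach` are made up for illustration, and the proxy address in the example is a placeholder, not a real server.

```python
import urllib.error
import urllib.request


def proxy_settings(environ):
    """Collect http/https proxy URLs from an environment-like mapping."""
    found = {}
    for scheme in ("http", "https"):
        # Both lower-case and upper-case variable names are in common use.
        value = environ.get(scheme + "_proxy") or environ.get(scheme.upper() + "_PROXY")
        if value:
            found[scheme] = value
    return found


def can_reach(url, proxies=None, timeout=10):
    """Return True if a server answered for `url`, directly or via `proxies`."""
    # An empty dict tells ProxyHandler to use no proxy at all.
    handler = urllib.request.ProxyHandler(proxies or {})
    opener = urllib.request.build_opener(handler)
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied with an error status, so the network path works.
        return True
    except (urllib.error.URLError, OSError) as err:
        # Covers refused connections, DNS failures and timeouts.
        print("Could not reach %s: %s" % (url, err))
        return False


# Placeholder environment; on a real machine pass os.environ instead.
print(proxy_settings({"HTTP_PROXY": "http://proxy.example:3128"}))
```

Calling `can_reach(url)` and then `can_reach(url, proxies=proxy_settings(os.environ))` for a site such as UniProt would show whether a direct connection is blocked while the proxied one works.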

mihirdate commented 9 years ago

Thanks John and Danny, you are right. I tried it on the MSKCC hal cluster and it produced output. It even generated a model, but I guess it failed at the end when running the short MD on the GPU. Can I attach the output as a txt file on this issue tracker? I will check with my sysadmin about the firewall settings on our end.

jchodera commented 9 years ago

Go ahead and email me the output, unless you can excerpt the relevant bit of it here using GitHub Markdown.

Were you sure to run this through an interactive session

qsub -I -l walltime=04:00:00,nodes=1:ppn=1:gpus=1:shared -l mem=4G -q active

or the batch queue? Compute-intensive tasks should not be run on the head node, and it doesn't have any GPUs.

mihirdate commented 9 years ago

We configured the proxy settings on our GPU (GTX780) workstation and the Ensembler example completed successfully within 15-20 min. So we have Ensembler working.

I tried submitting jobs on MSKCC hal cluster with

qsub submit-GPU.sh -I -l walltime=04:00:00,nodes=1:ppn=1:gpus=1:shared -l mem=4G -q active

The job remained in the running state for 4 hrs and did not produce any output at the end. I think it only locked a GPU and did not actually run.

On a side note, if I want to use Ensembler to model only the missing parts of a protein crystal structure I have, what are my options?

ensembler quickmodel --target_uniprot_entry_name my_protein --uniprot_domain_regex '^my protein' --template_pdbids ./my_pdb.pdb --no-loopmodel

jchodera commented 9 years ago

The -I to qsub opens an interactive session. If you want to submit the batch script non-interactively, it is just qsub submit-GPU.sh.

Can you post the contents of your batch script submit-GPU.sh?
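To make the difference concrete, here is a small sketch that builds the two kinds of qsub command line discussed in this thread. The helper `qsub_command` is hypothetical; it only uses the `-I`, `-l` and `-q` options already shown above and prints the commands rather than running them.

```python
def qsub_command(script=None, resources=(), queue=None):
    """Build a qsub argument list: batch if `script` is given, interactive otherwise."""
    cmd = ["qsub"]
    if script is None:
        # No script to submit, so request an interactive session instead.
        cmd.append("-I")
    for resource in resources:
        cmd += ["-l", resource]
    if queue:
        cmd += ["-q", queue]
    if script is not None:
        # For a batch job the script carries its own #PBS directives.
        cmd.append(script)
    return cmd


batch = qsub_command(script="submit-GPU.sh")
interactive = qsub_command(
    resources=["walltime=04:00:00,nodes=1:ppn=1:gpus=1:shared", "mem=4G"],
    queue="active",
)
print(" ".join(batch))
print(" ".join(interactive))
```

The point of keeping the two cases separate is that `-I` and a script name do not belong in the same submission: either the script is queued as a batch job or a shell is opened on a compute node.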

jchodera commented 9 years ago

@danielparton: Can you tackle the last question?

On a side note, if I want to use Ensembler to model only the missing parts of a protein crystal structure I have, what are my options?

ensembler quickmodel --target_uniprot_entry_name my_protein --uniprot_domain_regex '^my protein' --template_pdbids ./my_pdb.pdb --no-loopmodel

mihirdate commented 9 years ago

Thanks John. For some reason -I did not run. But submitting it non-interactively worked. The job is now running and producing output. It looks like it will complete successfully.

Here is my submit script.

#!/bin/tcsh
# Batch script for a GPU job on the cbio cluster
# utilizing 1 GPU, with one process on one node (see the nodes/ppn/gpus request below)
#walltime : maximum wall clock time (hh:mm:ss)
#PBS -l walltime=04:00:00
# join stdout and stderr
#PBS -j oe
# spool output immediately
#PBS -k oe
# specify GPU queue
#PBS -q gpu
# nodes: number of nodes
#ppn: number of processes per node
#gpus: number of gpus per node
#GPUs are in 'exclusive' mode by default, but 'shared' keyword sets them to shared mode.
#PBS -l nodes=1:ppn=1:gpus=1:shared
#PBS -l mem=4G
#PBS -q active
# export all my environment variables to the job
#PBS -V
# job name (default = name of script file)
#PBS -N myjob
# specify email for notifications
#PBS -M username@email.com
#mail settings (one or more characters)
#n: do not send mail
#a: send mail if job is aborted
#b: send mail when job begins execution
#e: send mail when job terminates
#PBS -m n
#filename for standard output (default = <job_name>.o<job_id>)
#at end of job, it is in directory from which qsub was executed
##PBS -o myoutput
# Change to working directory used for job submission
cd $PBS_O_WORKDIR

ensembler quickmodel --target_uniprot_entry_name EGFR_HUMAN --uniprot_domain_regex '^Protein kinase' --template_pdbids 1M14,4AF3 --no-loopmodel

@danielparton @jchodera: If you wish, we can close this issue and I will open a new one on Ensembler usage. Let me know.

jchodera commented 9 years ago

@mihirdate, can you reformat the pasted code with "fenced code blocks"? The formatting is getting all screwed up.

See here: https://help.github.com/articles/github-flavored-markdown/#fenced-code-blocks

mihirdate commented 9 years ago

Oops. Sorry about that. Done!

jchodera commented 9 years ago

Much better, thanks!

jchodera commented 9 years ago

Glad it worked! Go ahead and close this issue and open new ones for other issues.

mihirdate commented 9 years ago

By the way, the script worked when I submitted it non-interactively. The Ensembler example produced the appropriate output.

mihirdate commented 9 years ago

Thanks John and Danny. Closing this now.