obophenotype / human-phenotype-ontology

Ontology for the description of human clinical features
http://obophenotype.github.io/human-phenotype-ontology/

intermittent problems retrieving HP over internet #742

Closed balhoff closed 8 years ago

balhoff commented 8 years ago

In my software I programmatically retrieve HP by passing the IRI to OWL API (http://purl.obolibrary.org/obo/hp.owl). Fairly frequently my script fails with a loading error for HP. This doesn't really happen with the other ontologies I'm using. But if I just run it again, it usually works. The OWL API prints out something like this:

Parser: RDFXMLParser
org.xml.sax.SAXParseException; systemId: http://purl.obolibrary.org/obo/hp.owl; lineNumber: 104868; columnNumber: 69; XML document structures must start and end within the same entity.

I think @cmungall mentioned that he has seen this also. I just wanted to make an official report in case there is something about the hosting of HP that might occasionally cause broken downloads.
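
The SAXParseException above ("XML document structures must start and end within the same entity") is what a parser typically reports when the document ends early, which would fit a truncated download. A minimal sketch of one client-side workaround, assuming Python rather than the OWL API: download the file, check that it is well-formed XML, and retry if it is not. The names `is_well_formed`, `http_get`, and `fetch_complete` are made up for this illustration, and the attempt count, delay, and timeout are arbitrary.

```python
import http.client
import time
import urllib.request
import xml.sax


def is_well_formed(data: bytes) -> bool:
    """Return True if data parses as a complete XML document."""
    try:
        # A bare ContentHandler ignores all events; we only care about errors.
        xml.sax.parseString(data, xml.sax.ContentHandler())
        return True
    except xml.sax.SAXParseException:
        return False


def http_get(url: str, timeout: float = 60.0) -> bytes:
    """Download url into memory (redirects are followed by urllib)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_complete(url, fetch=http_get, attempts=3, delay=2.0):
    """Fetch url, retrying until the body is well-formed XML."""
    for attempt in range(1, attempts + 1):
        try:
            data = fetch(url)
            if is_well_formed(data):
                return data
        except (OSError, http.client.HTTPException):
            pass  # network failure or incomplete read; fall through and retry
        if attempt < attempts:
            time.sleep(delay)
    raise RuntimeError(f"no complete copy of {url} after {attempts} attempts")
```

The validated bytes could then be written to a local file, and that file handed to the ontology loader instead of the remote IRI.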

cmungall commented 8 years ago

Yes, I can't remember who originally reported - @jnguyenx or @DoctorBud? See also https://github.com/OBOFoundry/purl.obolibrary.org/issues/162

jnguyenx commented 8 years ago

Yeah I have the exact same issue with SciGraph, that's very annoying.

We plan to make archives of our set of data, so before a SciGraph load we'll copy the owls locally, and that should bypass this issue for us: https://github.com/monarch-initiative/monarch-devops/issues/3

cmungall commented 8 years ago

Don't reinvent the wheel here - see https://github.com/owlcollab/owltools/wiki/Import-Chain-Mirroring

It would be a good idea to put some of this into a reusable mavenable package that can be used in combination with any OWLAPI code

cmungall commented 8 years ago

cc @ShahimEssaid

drseb commented 8 years ago

There were multiple people reporting this problem (not only hp, but also chebi_import.owl). Most of them just switched to a newer version of Protégé and that seemed to work. Not sure how to trace this problem. Candidates are

  • OWL-API
  • purl-redirects
  • jenkins instance.

Anyway here is the result of my wget http://purl.obolibrary.org/obo/hp.owl:

Resolving purl.obolibrary.org (purl.obolibrary.org)... 52.3.123.63
Connecting to purl.obolibrary.org (purl.obolibrary.org)|52.3.123.63|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://compbio.charite.de/jenkins/job/hpo/lastStableBuild/artifact/hp/hp.owl [following]
--2016-03-01 10:09:56-- https://compbio.charite.de/jenkins/job/hpo/lastStableBuild/artifact/hp/hp.owl
Resolving compbio.charite.de (compbio.charite.de)... 141.42.207.15
Connecting to compbio.charite.de (compbio.charite.de)|141.42.207.15|:443... connected.

Any ideas anyone?

jnguyenx commented 8 years ago

I have the feeling that it's the owl-api which has too restrictive a timeout set. Unfortunately it's not configurable on our side.

It's driving me nuts ;-)

ShahimEssaid commented 8 years ago

I haven't had to customize timeouts before but they are set in this class: https://github.com/owlcs/owlapi/blob/a34eb27611a8e9c121f80fe43878cf0aebf4c8db/api/src/main/java/org/semanticweb/owlapi/io/AbstractOWLParser.java

There are also some JVM properties described here: http://docs.oracle.com/javase/6/docs/technotes/guides/net/properties.html

cmungall commented 8 years ago

Let's get this into the core OWLAPI

This also supports the idea that what we actually need is a proper modularization of the owltools import chain mirroring mechanism

ShahimEssaid commented 8 years ago

I'm slowly working on an Ivy based (but Maven repository format) dependency and packaging tool. I'm planning on having a custom OWLOntologyManager for this functionality.

If anyone is already working on a mirroring solution, please let me know what repository format you are planning to work on. I really think we need to adopt one of the existing formats and not reinvent something custom for our ontology work. The Maven format is the simplest and most widely used, but the Ivy format has more features (which we probably won't need any time soon).

cmungall commented 8 years ago

owltools slurp-import-chain simply mirrors the URL structure in the directory layout

e.g.

$ find purl.obolibrary.org
purl.obolibrary.org/obo/uberon
purl.obolibrary.org/obo/uberon/bridge
purl.obolibrary.org/obo/uberon/bridge/collected-adult-mammal.owl
purl.obolibrary.org/obo/uberon/bridge/collected-embryonic-mammal.owl
purl.obolibrary.org/obo/uberon/bridge/collected-mammal.owl
purl.obolibrary.org/obo/uberon/bridge/collected-metazoa.owl
purl.obolibrary.org/obo/uberon/bridge/collected-teleost.owl
purl.obolibrary.org/obo/uberon/bridge/collected-tetrapod.owl
purl.obolibrary.org/obo/uberon/bridge/collected-vertebrate.owl
purl.obolibrary.org/obo/uberon/bridge/uberon-bridge-to-aao.owl
purl.obolibrary.org/obo/uberon/bridge/uberon-bridge-to-ehdaa2.owl
purl.obolibrary.org/obo/uberon/bridge/uberon-bridge-to-emapa.owl
purl.obolibrary.org/obo/uberon/bridge/uberon-bridge-to-fma.owl
purl.obolibrary.org/obo/uberon/bridge/uberon-bridge-to-ma.owl
purl.obolibrary.org/obo/uberon/bridge/uberon-bridge-to-nif_grossanatomy.owl
purl.obolibrary.org/obo/uberon/bridge/uberon-bridge-to-tao.owl
purl.obolibrary.org/obo/uberon/bridge/uberon-bridge-to-xao.owl
purl.obolibrary.org/obo/uberon/bridge/uberon-bridge-to-zfa.owl
purl.obolibrary.org/obo/uberon/merged.owl
purl.obolibrary.org/obo/uberon/uberon-simple-bridge.owl
purl.obolibrary.org/obo/uberon.owl
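
The layout in the listing above amounts to host name plus URL path under some root directory. A small sketch of that mapping in Python, inferred only from the listing and not taken from the owltools source; `mirror_path` and its `root` parameter are hypothetical names for this illustration.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def mirror_path(url: str, root: str = ".") -> PurePosixPath:
    """Map an ontology URL to <root>/<host>/<path>, dropping the scheme."""
    parts = urlsplit(url)
    # lstrip("/") keeps the URL path relative so it nests under the host directory.
    return PurePosixPath(root) / parts.netloc / parts.path.lstrip("/")


print(mirror_path("http://purl.obolibrary.org/obo/uberon/bridge/uberon-bridge-to-zfa.owl"))
```

A loader that consults such a directory before going to the network only needs the same function to find the local copy of any import.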

drseb commented 8 years ago

I assume for now that this is not related to the way hp.obo/hp.owl is made available for download, but rather a problem with OWL-API. Please re-open if this is wrong.

cmungall commented 7 years ago

We're still seeing frequent timeouts for hp.owl, see @jnguyenx's comments here: https://github.com/monarch-initiative/monarch-owlsim-data/commit/7ac3a50418007076d20f4aa61165fcd55bb40d72#commitcomment-21219160

one option is to use the S3/cloudfront delivery we use for many ontologies, e.g.

http://ontologies.berkeleybop.org/hp.obo
http://ontologies.berkeleybop.org/hp.owl

in theory this should connect you to the fastest amazon region, cc @kltm

drseb commented 7 years ago

We should have worked on this last week. Let's switch to GH releases.

drseb commented 7 years ago

@jnguyenx please let me know if this is fixed now. hp.owl should now be served via GH.

jnguyenx commented 7 years ago

I think that hp.owl is fixed now, but I get another error from an import, which happens 80% of the time:

2017-03-09 10:03:57,053 ERROR (CommandRunner:4317) could not parse:http://purl.obolibrary.org/obo/hp.owl
org.semanticweb.owlapi.model.UnloadableImportException: Could not load imported ontology: <http://purl.obolibrary.org/obo/upheno/imports/hsapdv_import.owl> Cause: Server returned HTTP response code: 503 for URL: https://raw.githubusercontent.com/obophenotype/upheno/master/imports/hsapdv_import.owl

@cmungall is that yours?

balhoff commented 7 years ago

I got some of these 503s from GitHub this morning. I think it was a GitHub network problem today.

kltm commented 7 years ago

It's almost as if github shouldn't be a content server... ;)

drseb commented 7 years ago

...funny...

cmungall commented 7 years ago

it's better than hudson.

we can always go the s3/cloudfront route as noted here https://github.com/obophenotype/human-phenotype-ontology/issues/742#issuecomment-285213129

But for now this is good. Our stack should not be so fragile. We need to make steps like loading less network dependent. Split it into two: an ontology compilation step and then a load step. This applies to everything - minerva, scigraph loads, etc.

robot now has mirror (a port of owltools slurp-import-chain). See https://github.com/ontodev/robot/pull/146

https://github.com/ontodev/robot/tree/master/examples#mirroring
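
The compile-then-load split suggested above can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not how robot, minerva, or SciGraph do it: `compile_step`, `load_step`, and the `fetch` callable are hypothetical names, and `fetch` stands for whatever download routine is in use.

```python
import os
import tempfile
from pathlib import Path


def compile_step(sources, out_dir, fetch):
    """Fetch every source into out_dir; all network access happens here."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, url in sources.items():
        data = fetch(url)  # may raise; nothing is written for this name if it does
        # Write to a temporary file first, then rename, so a failed write
        # never replaces a good copy with a partial one.
        fd, tmp = tempfile.mkstemp(dir=out_dir)
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        os.replace(tmp, out_dir / name)


def load_step(names, in_dir):
    """Read previously compiled files from disk; no network access."""
    return {name: (Path(in_dir) / name).read_bytes() for name in names}
```

With this split, a 503 or a timeout fails the compilation step only, and the load step keeps working from the last good local copy.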