prefixcommons / data-ingest

Fix ping service #21

Open jmcmurry opened 7 years ago

jmcmurry commented 7 years ago
| todo | status | pattern | failing URI |
| --- | --- | --- | --- |
| make note | Doesn't seem to aspire to being resolvable in bioportal | http://purl.bioontology.org/ontology/SBO/SBO:$id | http://purl.bioontology.org/ontology/SBO/SBO:0000262 |
| make note | Doesn't seem to aspire to being resolvable in bioportal | http://purl.bioontology.org/ontology/MA/MA:$id | http://purl.bioontology.org/ontology/MA/MA:0002502 |
| track down error | This is the wrong pattern for amigo; where is it from? Should be http://amigo2.berkeleybop.org/amigo/term/GO:$id | http://amigo2.berkeleybop.org/cgi-bin/amigo2/amigo/term/GO:$id | http://amigo2.berkeleybop.org/cgi-bin/amigo2/amigo/term/GO:0006915 |
| track down error | I think this is an isDeprecated issue | http://www.obofoundry.org/ro/#OBO_REL:is_a$id | http://www.obofoundry.org/ro/#OBO_REL:is_a |
| try again | This is, I believe, correct, but the server is temporarily down; fix timeline unknown | http://compbio.charite.de/hpoweb/showterm?id=HP:$id | http://compbio.charite.de/hpoweb/showterm?id=HP:0000118 |
| try again | Seems to be working now; perhaps it was a transient outage? | http://pir.georgetown.edu/cgi-bin/pro/entry_pro?id=PR:$id | http://pir.georgetown.edu/cgi-bin/pro/entry_pro?id=PR:000000024 |

putmantime commented 7 years ago

I failed to apply the same obsolete=True filter in the ping module that I was using to create the list of agents. I have now applied it, and both RO and the deprecated amigo2 services are no longer being pinged and no longer appear in the list.
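
One way to keep the two code paths from drifting apart again is to route both through a single filter. The following is only a sketch under assumptions: the `obsolete` key and the function names are hypothetical, and only the `URIexample` field name is taken from the listing in this thread.

```python
def active_records(records):
    """Return only the records that are not flagged obsolete.

    Assumption: each record is a dict with an optional boolean
    'obsolete' key; a missing key is treated as not obsolete.
    """
    return [r for r in records if not r.get("obsolete", False)]


def ping_targets(records):
    """Build the list of URLs to ping from the same filtered set
    that is used to build the published list."""
    return [r["URIexample"] for r in active_records(records)]


records = [
    {"URIexample": "http://amigo2.berkeleybop.org/cgi-bin/amigo2/amigo/term/GO:0006915",
     "obsolete": True},
    {"URIexample": "http://pir.georgetown.edu/cgi-bin/pro/entry_pro?id=PR:000000024",
     "obsolete": False},
    {"URIexample": "http://compbio.charite.de/hpoweb/showterm?id=HP:0000118"},
]
targets = ping_targets(records)
```

With the sample records above, the deprecated amigo2 entry is dropped and the PR and HP entries remain as ping targets.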

jmcmurry commented 7 years ago

Great. The unresolvable purls in bioportal are the only real sticky wickets then.

putmantime commented 7 years ago

Here is the updated list; still not getting a response from HP and PR.


[
  {
    "URIexample": "http://purl.bioontology.org/ontology/MA/MA:0002502",
    "URIpattern": "http://purl.bioontology.org/ontology/MA/MA:$id"
  },
  {
    "URIexample": "http://purl.bioontology.org/ontology/SBO/SBO:0000262",
    "URIpattern": "http://purl.bioontology.org/ontology/SBO/SBO:$id"
  },
  {
    "URIexample": "http://pir.georgetown.edu/cgi-bin/pro/entry_pro?id=PR:000000024",
    "URIpattern": "http://pir.georgetown.edu/cgi-bin/pro/entry_pro?id=PR:$id"
  },
  {
    "URIexample": "http://compbio.charite.de/hpoweb/showterm?id=HP:0000118",
    "URIpattern": "http://compbio.charite.de/hpoweb/showterm?id=HP:$id"
  }
]
putmantime commented 7 years ago

But you're right, http://pir.georgetown.edu/cgi-bin/pro/entry_pro?id=PR:000000024 resolves when I paste it into my browser. Not sure why it keeps coming back 404 when I request it from the Python script.
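
The cause is not established in this thread. One thing that may be worth ruling out is that some servers respond differently to non-browser clients: `urllib` identifies itself as `Python-urllib/<version>` by default. The sketch below, using only the standard library, sends an explicit User-Agent and reports the HTTP status. The User-Agent string and the function names are assumptions for illustration, not the project's actual ping code.

```python
import urllib.error
import urllib.request

# Hypothetical User-Agent string; any descriptive value could be used.
PING_USER_AGENT = "Mozilla/5.0 (compatible; prefixcommons-ping/0.1)"


def build_request(url, user_agent=PING_USER_AGENT):
    """Create a GET request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def ping(url, timeout=10):
    """Return the HTTP status code, or None if no response was received."""
    try:
        with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except OSError:
        # DNS failure, refused connection, timeout, and similar.
        return None
```

Comparing `ping(url)` with and without the custom header would show whether the header makes any difference for the PR URL.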

jmcmurry commented 7 years ago

Super, thanks. None of these are failing due to poorly composed URLs, which is the real thing to avoid. Not sure what to do about bioportal. Perhaps actually remove them from the resulting merged set?

putmantime commented 7 years ago

Another case where I should have studied the XML better. Both have a state=down attribute. I can filter those out as well.

jmcmurry commented 7 years ago

Well, those can be transient. I wouldn't necessarily filter them out. We can preserve the state=down flag in our merged record if we want, but we're essentially doing our own test, which is more up-to-date by definition.
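
A minimal sketch of that idea: keep the upstream flag for reference, but record our own ping result next to it rather than filtering on the flag. The keys `upstream_state`, `ping_status`, and `ping_ok` are hypothetical names, and treating 2xx and 3xx as success is an assumption.

```python
def merge_status(record, upstream_state, http_status):
    """Return a copy of the record annotated with both the upstream
    state flag and the result of our own ping.

    http_status is the status code we observed, or None if the
    server did not respond at all.
    """
    merged = dict(record)
    merged["upstream_state"] = upstream_state
    merged["ping_status"] = http_status
    # Assumption: any 2xx or 3xx response counts as resolvable.
    merged["ping_ok"] = http_status is not None and 200 <= http_status < 400
    return merged


record = {
    "URIexample": "http://compbio.charite.de/hpoweb/showterm?id=HP:0000118",
    "URIpattern": "http://compbio.charite.de/hpoweb/showterm?id=HP:$id",
}
merged = merge_status(record, upstream_state="down", http_status=None)
```

The record is never dropped here; a consumer of the merged set can decide what to do with an entry whose upstream flag and own ping result disagree.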

putmantime commented 7 years ago

Ah, now I understand what the reliability scores mean. Well, would it be better to capture that from id.org, or generate our own?

jmcmurry commented 7 years ago

I think it is more important to do our own because it tests the uptime while also exposing situations in which we have composed the URL poorly, whether due to omitted prefix, duplicated prefix, casing issues, etc.
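
The composition problems listed above can also be caught before any request is sent, by comparing the composed URL with a known-good example. This is a sketch only: `diagnose` and its return labels are hypothetical, and it assumes `$id` is the only placeholder in a pattern, as in the patterns quoted in this thread.

```python
def diagnose(pattern, prefix, local_id, expected_url):
    """Compose a URL from pattern and local_id, then classify how it
    differs from a known-good example URL."""
    composed = pattern.replace("$id", local_id)
    if composed == expected_url:
        return "ok"
    if composed.lower() == expected_url.lower():
        return "casing"
    doubled = f"{prefix}:{prefix}:"
    if composed.replace(doubled, f"{prefix}:") == expected_url:
        return "duplicated prefix"
    if expected_url.replace(f"{prefix}:", "") == composed:
        return "omitted prefix"
    return "other mismatch"


expected = "http://amigo2.berkeleybop.org/amigo/term/GO:0006915"
good = "http://amigo2.berkeleybop.org/amigo/term/GO:$id"

results = [
    diagnose(good, "GO", "0006915", expected),
    diagnose(good, "GO", "GO:0006915", expected),
    diagnose("http://amigo2.berkeleybop.org/amigo/term/$id", "GO", "0006915", expected),
    diagnose("http://amigo2.berkeleybop.org/amigo/term/go:$id", "GO", "0006915", expected),
]
```

A check like this only covers composition; it complements the ping, which is still needed to test uptime.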