b-cube / semantics

OWL ontologies to describe web services found by the BCube Nutch Crawler.
GNU General Public License v3.0

OSDD template query parameters not consistently structured across services #1

roomthily closed 9 years ago

roomthily commented 9 years ago

In the MODAPS example, we have:

<Url type="text/html"
      xmlns:MODAPSParameters="http://modwebsrv.modaps.eosdis.nasa.gov/opensearchextensions/1.0/" template="http://modwebsrv.modaps.eosdis.nasa.gov/axis2/services/MODAPSservices/getOpenSearch?products={MODAPSParameters:products}&amp;collection={MODAPSParameters:collection?}&amp;start={time:start}&amp;stop={time:stop}&amp;bbox={geo:box}&amp;coordsOrTiles={MODAPSParameters:coordsOrTiles?}&amp;dayNightBoth={MODAPSParameters:dayNightBoth?}"/>

where all of the parameter definitions are wrapped in curly brackets but in a randomly selected OSDD, we have:

<?xml version="1.0" encoding="utf-8"?>
<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
    <ShortName>CEOS</ShortName>
    <Description/>
    <InputEncoding>UTF-8</InputEncoding>
    <Image type="image/vnd.microsoft.icon" width="16" height="16"
        >http://www.ceos.org/templates/oceanwaves/favicon.ico</Image>
    <Url type="application/opensearchdescription+xml" rel="self"
        template="http://www.ceos.org/index.php?option=com_search&amp;view=remind&amp;format=opensearch"/>
    <Url type="text/html"
        template="http://www.ceos.org/index.php?option=com_search&amp;searchword={searchTerms}"/>
</OpenSearchDescription>

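with one Url template only half following the {param} convention (searchword={searchTerms} is wrapped, but option=com_search is a literal value) and the other not following it at all (every value is a literal). Point being, the endpoint extraction currently assumes every value has the {param} structure and strips off the first and last characters no matter what, so a literal such as com_search would come out as om_searc.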
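For illustration, here is a minimal sketch of one way to tell templated values from literal ones before touching any characters. It is not the code from this repository or from the linked commit. The function name `parse_template_params`, the `PLACEHOLDER` pattern, and the shape of the returned dicts are all hypothetical. It also assumes the template attribute has already been XML-unescaped, so `&amp;` is a plain `&`.

```python
import re
from urllib.parse import parse_qsl, urlsplit

# Hypothetical pattern for an OpenSearch template parameter:
# {name}, {prefix:name}, optionally ending in "?" to mark it optional.
PLACEHOLDER = re.compile(
    r"^\{(?:(?P<prefix>[^{}:?]+):)?(?P<name>[^{}:?]+)(?P<optional>\?)?\}$"
)


def parse_template_params(template):
    """Classify each query parameter of an OSDD Url template.

    Values wrapped in curly brackets are reported as templated parameters;
    anything else is kept verbatim as a literal value.
    """
    params = []
    query = urlsplit(template).query
    for key, value in parse_qsl(query, keep_blank_values=True):
        match = PLACEHOLDER.match(value)
        if match:
            params.append({
                "key": key,
                "templated": True,
                "prefix": match.group("prefix"),  # None when no namespace prefix
                "name": match.group("name"),
                "required": match.group("optional") is None,
            })
        else:
            # Literal value: do not strip anything.
            params.append({"key": key, "templated": False, "value": value})
    return params


if __name__ == "__main__":
    ceos = "http://www.ceos.org/index.php?option=com_search&searchword={searchTerms}"
    for param in parse_template_params(ceos):
        print(param)
```

With the CEOS template, `option` is reported as a literal with its value intact and `searchword` as a templated parameter named `searchTerms`. The brackets are only removed when the whole value matches the pattern.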

roomthily commented 9 years ago

See the semantics-preprocessing OpenSearch processor: https://github.com/b-cube/semantics-preprocessing/commit/461aeb74e55d62a1c3a4408c461f03019c9dd62d