OER World Map Prototype
Enrich using geo data #8

Closed dr0i closed 10 years ago

dr0i commented 10 years ago

Consider a geo point such as lat=7.7928638&long=-72.202904. (The same applies when we only have literals such as country, city name, and street name; in that case another API lookup is needed.)

First thought: use an http://geonames.org ID to link and get data. But the GeoNames data is not clearly licensed and the API is restricted. We could provide an index of our own, but then the license becomes an issue.

Second thought: use http://gadm.geovocab.org/, which is at least free for non-commercial use (see the footer for the license note). So GADM is sadly not Open Data either, but it may fit better into our project. Look up via the API at http://gadm.geovocab.org/services/withinRegion?lat=7.7928638&long=-72.202904#point and get URIs.

I understand the importance of using a geo ID as an identifier, even without getting the data into our index, but the latter is important for us too, isn't it? Other thoughts?

literarymachine commented 10 years ago

In order to enable look-ups, the literal data is actually more important than the links. From a linked-data perspective, we should keep the links of course. API restrictions are fine for the editing front-end, at least for the prototype. The geonames-API works like a charm, making it possible to restrict to certain classes such as Countries and delivering RDF data that we can easily use:

http://api.geonames.org/search?q=United%20States&featureClass=A&featureCode=PCLI&username=demo&type=rdf

It seems that the actual problem is that the rate limit does not allow the geonames API to be used for the initial data conversion. At least for that, it should not pose a licencing problem to build an index for internal use?
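As an illustration of the search request above, the following minimal sketch only assembles the query URL and does not call the service. The function name `build_country_search_url` is hypothetical; the parameters and the `demo` username are copied from the example URL in this comment.

```python
from urllib.parse import quote, urlencode

GEONAMES_SEARCH = "http://api.geonames.org/search"


def build_country_search_url(name, username="demo"):
    """Assemble a GeoNames search URL with the same parameters as the example above."""
    params = {
        "q": name,
        "featureClass": "A",    # restricts the feature class, as in the example URL
        "featureCode": "PCLI",  # restricts the feature code, as in the example URL
        "username": username,
        "type": "rdf",          # ask for RDF output
    }
    # quote_via=quote encodes spaces as %20 instead of "+", matching the URL above
    return GEONAMES_SEARCH + "?" + urlencode(params, quote_via=quote)


url = build_country_search_url("United States")
print(url)
```

For a bulk conversion, such a function would be called once per distinct country name, which is where the rate limit mentioned above becomes relevant.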

acka47 commented 10 years ago

Obviously, we haven't clearly specified yet what data is essential and what kind of data can be neglected. I guess we should differentiate here between data from other sources and data that will be submitted via the API. (If we have too many requirements for bulk loading, it will be hard to get initial data into the system.)

@dr0i would prefer primarily using the OSM API as he currently has problems with retrieving the information from geonames. Let's walk through an example.

Example OCWC geo description:

{
         "type" : "Feature",
         "geometry" : {
            "type" : "Point",
            "coordinates" : [
               -72.2027326,
               7.7926166
            ]
         },
         "properties" : {
            "name" : "Universidad Nacional Experimental del Táchira"
         },
         "id" : 4
      }

The OSM API returns the following when queried with the geo coordinates, e.g. via curl "http://nominatim.openstreetmap.org/reverse?format=json&json_callback=callbackIntegerWrapid&lat=7.7929643&lon=-72.2028959&addressdetails=1":

callbackIntegerWrapid({"place_id":"9148635604","licence":"Data \u00a9 OpenStreetMap contributors, ODbL 1.0. http:\/\/www.openstreetmap.org\/copyright","osm_type":"node","osm_id":"2518163862","lat":"7.7929643","lon":"-72.2028959","display_name":"Bomberos Paramillo, Av Universidad, Barrio El Lobo, Municipio San Crist\u00f3bal, T\u00e1chira, Regi\u00f3n de Los Andes, Venezuela","address":{"fire_station":"Bomberos Paramillo","road":"Av Universidad","hamlet":"Barrio El Lobo","county":"Municipio San Crist\u00f3bal","state":"T\u00e1chira","country":"Venezuela","country_code":"ve"}})
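Because the request sets `json_callback`, the response is wrapped in a function call (JSONP) rather than being plain JSON. A minimal sketch of unwrapping it, assuming the wrapper always has the form `name(...)`; the helper name `parse_jsonp` is hypothetical and the sample is an abbreviated copy of the response above:

```python
import json
import re


def parse_jsonp(payload):
    """Strip a JSONP wrapper such as `callback({...})` and decode the JSON inside."""
    match = re.fullmatch(r"\s*[\w.$]+\s*\((.*)\)\s*;?\s*", payload, flags=re.DOTALL)
    if match is None:
        raise ValueError("payload is not wrapped in a callback")
    return json.loads(match.group(1))


# Abbreviated from the response above; a raw string keeps the \uXXXX escapes for json.loads.
sample = (
    r'callbackIntegerWrapid({"osm_type":"node","lat":"7.7929643","lon":"-72.2028959",'
    r'"address":{"road":"Av Universidad","county":"Municipio San Crist\u00f3bal",'
    r'"state":"T\u00e1chira","country":"Venezuela","country_code":"ve"}})'
)

data = parse_jsonp(sample)
address = data["address"]
print(address["country"], address["state"])
```

Leaving `json_callback` out of the request would presumably make this step unnecessary, since the body could then be passed to `json.loads` directly.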

The resulting RDF should look something like this (in turtle):

@prefix schema: <http://schema.org/> .
@prefix geo: <http://www.geonames.org/ontology#> .

<urn:uuid:1d6047f0-9dfe-11e3-a5e2-0800200c9a66>
    a schema:Organization ;
    schema:address [
        schema:addressCountry "Venezuela" ;
        # schema:addressLocality "" ;  Left out because there is no "city" match via OSM.
        # schema:postalCode "50674" ;  Also left out as there is no match.
        # schema:streetAddress "Jülicher Straße 6"  ditto
    ] ;
    geo:locatedIn <http://sws.geonames.org/3625428/> .

(This adds another property (geo:locatedIn) to the mix. I think this is better than repeating the schema:addressCountry property with a literal as well as with a URI.)

Proceeding like this, we would have a clear separation between the address as literals and the geo URIs: `schema:address` holds the address information as literals (in this case only the country, but in others probably more), and `geo:locatedIn` holds the country's geonames URI. I think this would be enough for the prototype. Any suggestions on how to improve this approach without much effort?
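The mapping described here could be sketched roughly as follows. This is an assumption-laden illustration, not the project's converter: the function names are hypothetical, the choice of Nominatim keys (`country`, `city`, `postcode`) is an assumption, the street is deliberately not mapped because the example above leaves it out, and the geonames URI is passed in by hand since the thread does not settle how it is looked up.

```python
# Hypothetical mapping from schema.org address properties to Nominatim address keys.
ADDRESS_KEYS = {
    "addressCountry": "country",
    "addressLocality": "city",
    "postalCode": "postcode",
}


def schema_address(osm_address):
    """Keep only the properties for which the OSM response has a match."""
    return {prop: osm_address[key] for prop, key in ADDRESS_KEYS.items() if key in osm_address}


def escape(value):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def to_turtle(subject_uri, address, located_in):
    """Render an organization with literal address parts and a geo:locatedIn link."""
    lines = [
        "@prefix schema: <http://schema.org/> .",
        "@prefix geo: <http://www.geonames.org/ontology#> .",
        "",
        f"<{subject_uri}>",
        "    a schema:Organization ;",
    ]
    if address:
        props = [f'        schema:{prop} "{escape(val)}"' for prop, val in address.items()]
        lines += ["    schema:address [", " ;\n".join(props), "    ] ;"]
    lines.append(f"    geo:locatedIn <{located_in}> .")
    return "\n".join(lines)


osm_address = {"road": "Av Universidad", "country": "Venezuela", "country_code": "ve"}
turtle = to_turtle(
    "urn:uuid:1d6047f0-9dfe-11e3-a5e2-0800200c9a66",
    schema_address(osm_address),
    "http://sws.geonames.org/3625428/",
)
print(turtle)
```

Properties without a match are simply omitted, which mirrors the commented-out lines in the Turtle example.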

acka47 commented 10 years ago

If we consistently model the descriptions as proposed above, we'd have to adjust the application profile accordingly. (See the draft profile at https://github.com/acka47/oerworldmap/blob/add-oer-ap/oerap.ttl.) Currently, there is no mention of geo:locatedIn in the AP, and both :addressCountry and addressLocality are restricted to http://www.geonames.org/ontology#Feature.

dr0i commented 10 years ago

@acka47, have a look: #56 should fix this issue (#8).

acka47 commented 10 years ago

Closing. The discussion here is outdated. See #5 for the current discussion and up-to-date solutions.