KnowWhereGraph / kwg-faceted-search

Knowledge Explorer: The search interface to KnowWhereGraph
http://stko-kwg.geog.ucsb.edu

Unable to search for hurricanes in Louisiana #252

Open ThomasThelen opened 2 years ago

ThomasThelen commented 2 years ago

Kitty reported an issue where if you're in the search page and

  1. Select Louisiana in the places dropdown
  2. Select the NOAAHurricane facet
  3. Note that you'll get 0 results

I think that this is happening because Hurricanes are connected to NWZones instead of administrative regions (to my understanding).

An example of finding hurricanes in a national weather zone (in Louisiana):

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX kwg-ont: <http://stko-kwg.geog.ucsb.edu/lod/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX sosa: <http://www.w3.org/ns/sosa/>
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX time: <http://www.w3.org/2006/time#>
PREFIX kwgr: <http://stko-kwg.geog.ucsb.edu/lod/resource/>
select distinct ?entity ?label ?time ?wkt {
        ?entity rdf:type kwg-ont:NOAAHurricane .
        ?entity rdfs:label ?label .

        ?entity kwg-ont:sfWithin ?place.
        values ?place {kwgr:NWZone.37080}
} LIMIT 20 OFFSET 0

This is in contrast to what we're doing in the user interface, which searches by administrative region and returns 0 results:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX kwg-ont: <http://stko-kwg.geog.ucsb.edu/lod/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX sosa: <http://www.w3.org/ns/sosa/>
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX time: <http://www.w3.org/2006/time#>
PREFIX kwgr: <http://stko-kwg.geog.ucsb.edu/lod/resource/>
select distinct ?entity ?label ?time ?wkt {
        ?entity rdf:type kwg-ont:NOAAHurricane .
        ?entity kwg-ont:sfWithin ?place.
         values ?place {kwgr:Earth.North_America.United_States.USA.19_1 kwgr:Earth.North_America.United_States.USA.19.10_1 kwgr:Earth.North_America.United_States.USA.19.11_1}
} LIMIT 20 OFFSET 0

zilongliu-geo commented 2 years ago

I see. The best way to retrieve hazards within an area of interest is always to use the s2 cells to find the associated regions. I would suggest updating the s2 cell usage in hazard queries. I will do it later.

zilongliu-geo commented 2 years ago

The query below should retrieve hurricanes located in Louisiana, but it returns no results.

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX kwg-ont: <http://stko-kwg.geog.ucsb.edu/lod/ontology/>
PREFIX sosa: <http://www.w3.org/ns/sosa/>
PREFIX time: <http://www.w3.org/2006/time#>
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX kwgr: <http://stko-kwg.geog.ucsb.edu/lod/resource/>
select distinct ?entity ?label ?time ?wkt
{
    ?entity rdf:type ?type; 
            rdfs:label ?label;
            kwg-ont:hasTemporalScope|sosa:isFeatureOfInterestOf/sosa:phenomenonTime ?time.                
    optional
    {
        ?entity geo:hasGeometry/geo:asWKT ?wkt.
    }
    ?type rdfs:subClassOf kwg-ont:Hazard.
    ?entity kwg-ont:sfWithin ?place.
    ?time time:inXSDDateTime|time:inXSDDate ?startTimeLabel;
                            time:inXSDDateTime|time:inXSDDate ?endTimeLabel.
    filter (?type in (kwg-ont:NOAAHurricane))

    ?entity kwg-ont:sfWithin|kwg-ont:sfWithin/kwg-ont:sfContains ?s2Cell .
    ?s2Cell rdf:type kwg-ont:KWGCellLevel13 .
    values ?placesConnectedToS2 {kwgr:Earth.North_America.United_States.USA.19_1}
    ?s2Cell kwg-ont:spatialRelation ?placesConnectedToS2.

} limit 20

ThomasThelen commented 2 years ago

We're definitely on the right track here. After chatting with the ontology team, I can confirm you're right that the S2 cells are going to be the most reliable. I would suggest starting with a smaller query and seeing if you can get it to work with a sample of hazard types (if not all of them). Then, once it's working, incorporate it into the larger query above.

ThomasThelen commented 2 years ago

Putting this in the next milestone, 1.2.1

zilongliu-geo commented 2 years ago

This issue is solved by the latest commit, which uses S2 cell-based search for all hazard search scenarios: https://github.com/KnowWhereGraph/kwg-faceted-search/commit/0e8aba6dc59a1108d91433bb6c6fba2eb4329643. One potential problem with this temporary solution is that we might not retrieve all the earthquakes we want, because earthquakes don't have S2 cells connected to them.

ThomasThelen commented 2 years ago

I see that #274 was linked to this issue. When testing, I found that I still wasn't able to find hurricanes in Louisiana. @zilongliu-geo can you check on your end?

zilongliu-geo commented 2 years ago

@ThomasThelen Try the same search for Tornado or Hail. You will see that the query works for those hazards but not for Hurricane, which means this is a data issue.

ThomasThelen commented 2 years ago

Are you able to pinpoint the issue with the data? The SPARQL query should be breaking on a particular pattern.
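One way to narrow this down, offered only as a sketch: count the hurricanes grouped by the type of place they are linked to through kwg-ont:sfWithin. The sfWithin link for hurricanes comes from the queries earlier in this thread. It is an assumption that the linked places carry an rdf:type that identifies them as NWZones, administrative regions, or S2 cells, and the query has not been run against the endpoint.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX kwg-ont: <http://stko-kwg.geog.ucsb.edu/lod/ontology/>
select ?placeType (count(distinct ?entity) as ?hurricanes)
{
    # Hurricanes and the places they sit within (pattern taken from the queries above)
    ?entity rdf:type kwg-ont:NOAAHurricane ;
            kwg-ont:sfWithin ?place .
    # Assumption: each linked place is typed in the graph
    ?place rdf:type ?placeType .
}
group by ?placeType
```

If no S2 cell or administrative region type appears in the result, that would point to the sfWithin pattern as the place where the hazard query breaks for hurricanes.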

zilongliu-geo commented 2 years ago

@ThomasThelen This is because a Hurricane is linked to an NWZone, and that NWZone is in turn linked to an S2 cell, whereas a Tornado is associated with an S2 cell directly. I replaced ?entity kwg-ont:spatialRelation|kwg-ont:sfWithin/kwg-ont:spatialRelation ?s2Cell . with ?entity kwg-ont:spatialRelation ?s2Cell . because the longer predicate chain, although correct, does not return results within the query time limit.
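
The two linkage patterns described above can be written as a union instead of a property-path alternative. This is a minimal sketch only: the place identifier and the KWGCellLevel13 class are copied from the earlier query in this thread, and the assumption that an NWZone reaches its S2 cells through kwg-ont:spatialRelation comes from the predicate chain quoted above. Whether this form finishes within the query time limit is untested.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX kwg-ont: <http://stko-kwg.geog.ucsb.edu/lod/ontology/>
PREFIX kwgr: <http://stko-kwg.geog.ucsb.edu/lod/resource/>
select distinct ?entity ?s2Cell
{
    # Restrict to one hazard type and one place to keep the join small
    ?entity rdf:type kwg-ont:NOAAHurricane .
    values ?placesConnectedToS2 {kwgr:Earth.North_America.United_States.USA.19_1}
    ?s2Cell rdf:type kwg-ont:KWGCellLevel13 ;
            kwg-ont:spatialRelation ?placesConnectedToS2 .
    {
        # Direct link: hazard to S2 cell (the Tornado case)
        ?entity kwg-ont:spatialRelation ?s2Cell .
    }
    union
    {
        # Indirect link: hazard to NWZone to S2 cell (the Hurricane case)
        ?entity kwg-ont:sfWithin ?zone .
        ?zone kwg-ont:spatialRelation ?s2Cell .
    }
} limit 20
```

Running each branch of the union on its own would show which linkage actually returns hurricanes and which one is responsible for the timeout.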