KnowWhereGraph / kwg-faceted-search

Knowledge Explorer: The search interface to KnowWhereGraph
http://stko-kwg.geog.ucsb.edu

Duplicate search results #265

Closed ThomasThelen closed 2 years ago

ThomasThelen commented 2 years ago

When searching for hazards, it's possible to get duplicate results in the table. I've noticed this with both the keyword search and the zip code search. I'm pretty sure this is a regression: we dealt with duplicate results in the past, but something may have changed since then. The change would have been made after 1.0.0.

To reproduce:

  1. Go to the hazards tab
  2. Search the zip code 93105
  3. Check a few of the results to see if there are duplicates (confirm that their URIs are the same)

ThomasThelen commented 2 years ago

I think this is actually related to #263. In that issue, we're seeing two different dates for some of the hazards in the SPARQL results. Since the dates aren't concatenated into a single value, each one produces its own row, which is why the same hazard appears twice. The real issue is that we're retrieving an incorrect time. 63b977702d9caddea28bc450a70e0438ed6be7f8 was a recent change to this area that involved duplicates and time values. I've narrowed part of the problem down to this pattern: kwg-ont:hasTemporalScope|sosa:isFeatureOfInterestOf/sosa:phenomenonTime ?time .
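
To illustrate the mechanism, here is a minimal Python sketch of how one hazard bound to two time values ends up as two rows, and how the repeated URIs could be spotted. The row layout, the example URIs, and the `duplicated_uris` helper are all hypothetical; this is not the application's code or the actual query output.

```python
from collections import Counter

# Hypothetical result rows: each dict stands in for one SPARQL binding.
# The first hazard matches two ?time values, so it appears in two rows.
rows = [
    {"hazard": "http://example.org/hazard/1", "time": "2020-08-16"},
    {"hazard": "http://example.org/hazard/1", "time": "2020-08-17"},
    {"hazard": "http://example.org/hazard/2", "time": "2021-01-05"},
]


def duplicated_uris(rows):
    """Return the hazard URIs that occur in more than one row."""
    counts = Counter(row["hazard"] for row in rows)
    return [uri for uri, n in counts.items() if n > 1]


duplicates = duplicated_uris(rows)
print(duplicates)
```

Any URI reported by a check like this corresponds to a hazard that would be listed more than once in the results table.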

zilongliu-geo commented 2 years ago

The problem is not with the predicate used to find temporal information, because some types of hazards (e.g., earthquakes) don't have temporal scopes. I have changed the query a little bit so that we no longer get wrong or duplicate records with different temporal information. Basically, the solution is to select the first retrieved date/datetime associated with a hazard entity. See https://github.com/KnowWhereGraph/kwg-faceted-search/commit/a69f563f366aaf9ff373c60ed61d32e1b7bccb2b
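
The idea of keeping only the first date/datetime per hazard can be sketched client-side in Python. The linked commit changes the query itself, so this is only an illustration of the idea. The row layout, the example URIs, and the `first_time_per_hazard` helper are hypothetical, and "first" is assumed here to mean the first value encountered in the result order.

```python
# Hypothetical result rows: hazard 1 has two time values, and the
# earthquake has none (mirroring hazards without temporal information).
rows = [
    {"hazard": "http://example.org/hazard/1", "time": "2020-08-16"},
    {"hazard": "http://example.org/hazard/1", "time": "2020-08-17"},
    {"hazard": "http://example.org/earthquake/7"},
    {"hazard": "http://example.org/hazard/2", "time": "2021-01-05"},
]


def first_time_per_hazard(rows):
    """Keep one row per hazard URI, using the first time value seen.

    Hazards with no time value are kept with time set to None.
    """
    kept = {}
    for row in rows:
        uri = row["hazard"]
        if uri not in kept:
            # dicts preserve insertion order, so result order is stable
            kept[uri] = {"hazard": uri, "time": row.get("time")}
    return list(kept.values())


deduplicated = first_time_per_hazard(rows)
for row in deduplicated:
    print(row["hazard"], row["time"])
```

If the earliest value is wanted rather than the first one encountered, the rows could be sorted by time before this step.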