In Space.csv, all items (around 438) are checked by comparing their names and coordinates with OpenStreetMap and Wikidata, except those that have a space_type of L or whose space_name is unknown.
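As a rough illustration of that selection rule, here is a minimal sketch. It assumes the columns are literally named space_name and space_type, and that "unknown" should be matched case-insensitively. The sample rows are made up and stand in for Space.csv.

```python
import csv
import io


def rows_to_check(rows):
    """Keep rows whose space_type is not "L" and whose space_name is not "unknown".

    Assumption: whitespace around values is not significant, and "unknown"
    is compared case-insensitively.
    """
    return [
        row for row in rows
        if row["space_type"].strip() != "L"
        and row["space_name"].strip().lower() != "unknown"
    ]


# Tiny in-memory stand-in for Space.csv (hypothetical rows, not real data).
sample = io.StringIO(
    "space_name,space_type\n"
    "Baiyangdian,\n"
    "unknown,\n"
    "Example line,L\n"
)
selected = rows_to_check(csv.DictReader(sample))
print([row["space_name"] for row in selected])
```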
To verify the coordinates against Wikidata, we previously used a third-party Python library, which left a list of 52 space items that could not be verified.
To avoid dependency issues, the Python library has been removed and direct Wikidata SPARQL queries are used instead. The new still_no_match_list currently contains 72 items.
This is because the SPARQL query is stricter about names.
For example, querying Wikidata for Baiyangdian only returns results such as Baiyangdian Lake, so with the current SPARQL query the item with space_name Baiyangdian has no match.
Should we update the SPARQL query to support fuzzy search?
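One possible direction, sketched here under assumptions and not taken from the current code: instead of matching the label exactly, the query could delegate the name lookup to the Wikidata search API through the query service's MWAPI integration (the EntitySearch API), and then keep only candidates that have a coordinate location (P625). The function names build_search_query and name_matches are hypothetical helpers for this illustration. How loosely the search API matches a given name would need to be checked against the real items.

```python
def build_search_query(space_name, language="en", limit=5):
    """Build a SPARQL query that uses the MWAPI EntitySearch service.

    Assumption: this is meant to replace an exact-label match; the
    project's actual query is not shown here.
    """
    # Escape backslashes and double quotes so the name is a valid SPARQL string.
    escaped = space_name.replace("\\", "\\\\").replace('"', '\\"')
    return f"""
SELECT ?item ?itemLabel ?coord WHERE {{
  SERVICE wikibase:mwapi {{
    bd:serviceParam wikibase:endpoint "www.wikidata.org" ;
                    wikibase:api "EntitySearch" ;
                    mwapi:search "{escaped}" ;
                    mwapi:language "{language}" .
    ?item wikibase:apiOutputItem mwapi:item .
  }}
  ?item wdt:P625 ?coord .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{language}" . }}
}}
LIMIT {int(limit)}
"""


def name_matches(space_name, label):
    """Loose local check: every word of space_name appears in the label,
    ignoring case, so the label may carry extra words."""
    wanted = space_name.casefold().split()
    found = label.casefold().split()
    return all(word in found for word in wanted)


query = build_search_query("Baiyangdian")
print(query)
print(name_matches("Baiyangdian", "Baiyangdian Lake"))
```

The query string can be sent to the SPARQL endpoint the same way as the existing one. Looser name matching will probably also produce more false positives, so the existing coordinate comparison should stay in place as the final check.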
still_no_match_list (using the Python library):
still_no_match_list (using the direct Wikidata SPARQL service):