Open FStriewski opened 1 year ago
@FStriewski you don't need to ingest a copy of the dataset. You can use the API: https://api3.geo.admin.ch/services/sdiservices.html#search
Thanks! So how does this scale for a production system? Is there any limit on the number of requests per time interval that might cause issues down the line? Or won't that apply because both services will be hosted by Swisstopo?
@davidoesch can you answer here?
geo.admin.ch/terms-of-use
20 requests per minute.
But I don't think the request peaks on a normal usage day will hit the infrastructure.
So: with normal usage there is no danger. Just don't scrape the data via the services.
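To stay under that limit on the client side, a small sliding-window throttle could be put in front of the API calls. This is only a sketch: the class name `RequestThrottle` and its methods are made up for illustration, and the default of 20 requests per 60 seconds is taken from the figure quoted above, not from a verified reading of the terms of use.

```python
import time
from collections import deque


class RequestThrottle:
    """Allow at most `max_requests` calls per `window_seconds` (sliding window).

    Hypothetical helper; the defaults mirror the "20 requests per minute"
    figure quoted in this thread.
    """

    def __init__(self, max_requests=20, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self._timestamps = deque()

    def wait_time(self):
        """Return how many seconds to wait before the next request is allowed."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._timestamps[0])

    def record(self):
        """Register that a request is being sent now."""
        self._timestamps.append(self.clock())
```

Before each call to the API, the caller would sleep for `wait_time()` seconds and then call `record()`.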
Thanks David.
I checked the Redis documentation yesterday, however, its geospatial indexing / geosearch features do not support bounding boxes but point coordinate tuples, distances and radius/box (https://redis.io/commands/geosearch/).
While there is probably a way to put this to use, I was wondering if we are about to overengineer things. If I got you right, the goal is to get the Kanton (or Bund + Geodienste) of the location by Swissnames. Why not have the user provide that information in the first place, e.g. by a dropdown in the UI and a parameter in the API? We planned to add some filters anyway. This way we also don't have to worry about the issue that some names are used in multiple Kantons.
Am I missing something?
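For comparison, if each dataset carried its own bounding box, a point-in-box filter in application code would sidestep the Redis limitation entirely. The sketch below is not Redis functionality; the `BBox` type, the `bbox` record field and the coordinates are rough illustrative assumptions (WGS84 lon/lat), not real dataset metadata.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Bounding box in WGS84 degrees (hypothetical record layout)."""
    west: float
    south: float
    east: float
    north: float

    def contains(self, lon: float, lat: float) -> bool:
        return self.west <= lon <= self.east and self.south <= lat <= self.north


def filter_by_point(datasets, lon, lat):
    """Keep the datasets whose bounding box contains the given point."""
    return [d for d in datasets if d["bbox"].contains(lon, lat)]


# Rough, illustrative extents only.
datasets = [
    {"title": "Raumkonzept Aargau", "bbox": BBox(7.71, 47.14, 8.46, 47.62)},
    {"title": "Plan directeur Genève", "bbox": BBox(5.95, 46.13, 6.31, 46.37)},
]
hits = filter_by_point(datasets, lon=8.18, lat=47.39)  # roughly Lenzburg
```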
With a geosearch in Redis in combination with the location search of api.geo.admin.ch, like https://api3.geo.admin.ch/rest/services/api/SearchServer?searchText=Lenzburg&type=locations, for the following use case: I search for "Raumkonzepte in Lenzburg". The Redis-backed search first checks whether the strings are just filler words like "in", "at", etc., and then checks whether the nouns in the search are either in the Redis DB itself or give a result in the location search of api.geo.admin.ch. If the latter is the case, use the result of api.geo.admin.ch to extract lat/lon from the response and then pass it to the Redis geosearch to provide a list of possible results.
So the goal is a user interface where you enter "Raumkonzepte in Lenzburg" and then get a list of results with links, similar to today's POC geoharvester, just with an additional field containing the location name (if there are multiple results).
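The location lookup in this use case could look roughly like the sketch below. It only builds the request URL and parses an already decoded JSON response; the response field names (`results`, `attrs`, `label`, `lat`, `lon`) are an assumption about the SearchServer output and should be checked against the API documentation. The function names are made up for illustration.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://api3.geo.admin.ch/rest/services/api/SearchServer"


def build_location_search_url(search_text, origins=None):
    """Build the location search URL; `origins` is an optional list of origin names."""
    params = {"searchText": search_text, "type": "locations"}
    if origins:
        params["origins"] = ",".join(origins)
    return f"{SEARCH_URL}?{urlencode(params)}"


def extract_first_location(response):
    """Return (label, lat, lon) of the first hit, or None if there is no hit.

    Assumes the decoded JSON looks like
    {"results": [{"attrs": {"label": ..., "lat": ..., "lon": ...}}]}.
    """
    results = response.get("results", [])
    if not results:
        return None
    attrs = results[0]["attrs"]
    return attrs["label"], attrs["lat"], attrs["lon"]


url = build_location_search_url("Lenzburg")
# Hypothetical response, not real API output.
sample = {"results": [{"attrs": {"label": "Lenzburg (AG)", "lat": 47.39, "lon": 8.18}}]}
location = extract_first_location(sample)
```

The actual HTTP request (and the rate limit discussed earlier) is left out on purpose.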
To achieve this use case, you could implement a multi-step process:
1. Preprocess the search query by removing any stop words like "in", "at", etc. This will help to identify the important keywords in the search query that can be used for further processing.
2. Identify the nouns in the search query and check if they are present in the Redis database. You could use a natural language processing (NLP) library like spaCy or NLTK to extract the nouns from the search query.
3. If the nouns are not present in the Redis database, then use the location search of api.geo.admin.ch to obtain the lat/lon coordinates for the location. You could make an HTTP GET request to the API endpoint with the search query as a parameter, and then parse the JSON response to extract the coordinates.
4. Once you have the lat/lon coordinates, you can use Redis geosearch to find all the possible results near that location. Redis supports geospatial queries through the GEOADD and GEOSEARCH commands (GEOSEARCH supersedes GEORADIUS, which is deprecated as of Redis 6.2). You could add the locations to Redis using the GEOADD command and then query for nearby locations using the GEOSEARCH command with the coordinates and a search radius.
5. Finally, you can return the list of possible results to the user. You could sort the results by distance from the search location to provide a ranked list of results.
def search(search_query, radius_km):
    # Step 1: Preprocess the search query by removing stop words
    preprocessed_query = remove_stop_words(search_query)
    # Step 2: Check if nouns in the search query are in the Redis database
    nouns = extract_nouns(preprocessed_query)
    redis_results = query_redis(nouns)
    # Step 3: If no results found in Redis, use the location search API to get lat/lon
    if not redis_results:
        location_results = query_location_api(preprocessed_query)
        lat, lon = extract_lat_lon(location_results)
        redis_results = query_redis_by_location(lat, lon)
        # Step 4: Use Redis geosearch to find all possible results near the location
        # (kept inside this branch, because lat/lon are only known here)
        redis_results += query_redis_by_geosearch(lat, lon, radius_km)
    # Step 5: Return a list of ranked results to the user
    return rank_results(redis_results)


ranked_results = search("Raumkonzepte in Lenzburg", radius_km=10)  # radius is a placeholder value
Note that the pseudocode assumes the implementation of several helper functions, such as remove_stop_words, extract_nouns, query_redis, query_location_api, extract_lat_lon, query_redis_by_location, query_redis_by_geosearch, and rank_results. You would need to implement these functions using appropriate libraries and APIs, depending on your specific programming language and environment.
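To make the last two steps more concrete, here is an in-memory stand-in for what `query_redis_by_geosearch` and `rank_results` would do. It is not Redis code: it uses a plain haversine distance over a dictionary of points, which only mimics a radius search sorted by distance. The function names and the sample coordinates are assumptions for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two WGS84 points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def query_by_radius(points, lat, lon, radius_km):
    """Return (name, distance_km) pairs within radius_km, nearest first.

    `points` maps a name to a (lat, lon) tuple.
    """
    hits = [
        (name, haversine_km(lat, lon, p_lat, p_lon))
        for name, (p_lat, p_lon) in points.items()
    ]
    return sorted((h for h in hits if h[1] <= radius_km), key=lambda h: h[1])


# Illustrative points only.
points = {"a": (47.39, 8.18), "b": (47.40, 8.18), "c": (46.95, 7.44)}
nearby = query_by_radius(points, lat=47.39, lon=8.18, radius_km=10)
```

In a real setup the same question would be answered by a geospatial index in Redis instead of a linear scan.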
Is it overengineered? Maybe
Yes, I had something similar in mind. Stop word removal and noun detection are (afaik) already implemented.
I see a couple of potential problems:
So that's why I was thinking "have the user provide more context". I am not too familiar with the spatial extent of the various datasets, so thinking only on Kanton level might indeed not be good enough.
Then again, what is the problem we really want to solve here? Better matching of query and results? Couldn't we achieve this more easily with
To keep things simple, IMHO the "search by location" is just a filter. We can add a check box "Search/Filter by location": if checked, we present the user a little map where we ask them to identify an area of interest by drawing a point/rectangle. The "text" search and the "search by location" can work together or independently. E.g.:
Search by location via a filter is an easy solution, already available in geocat.ch (selecting the provider on the left side) or via a bbox in the GUI, as in e.g. https://suche.kartenportal.ch/#bbox=7.959899902343747,46.27529514370323,9.756774902343748,47.16790406422331&q=&date_from=0&date_to=9999&scale_from=&scale_to=&libraries=&scanned_only=0&series=&map=road .
Caveat: it is another GUI element you have to deal with, a map plus toggle buttons to activate or deactivate it. Implementing a map-based viewer seems to be the easiest way.
However: the user usually has a question, and the state of the art these days (see the rise of LLMs like ChatGPT) is that you state your question, with no more fiddling around with user interfaces. Solving it all from one single text box would be innovative.
Addressing @FStriewski's points:
-> Take the first ten results you get from https://api.geo.admin.ch/rest/services/api/SearchServer?searchText=Bern&origins=kantone,district,gg25&type=locations . What we then do in the map.geo.admin.ch search: we first show the results for "Kantone" (kantone), second those for "Bezirk" (district), and third those for "Gemeinde" (gg25). The results within those groups are ranked by "rank" and "weight".
Rank the layers so that you first list the datasets from BUND and GEODIENSTE, if they cover the topic, and then the other datasets.
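The grouping of location results described above could be sketched as follows. The field layout (`attrs.origin`, `attrs.rank`, top-level `weight`) and the sort directions (lower rank first, higher weight first) are assumptions; the thread only says the groups are ranked by "rank" and "weight". The sample records are made up.

```python
# Display order of the groups: Kantone, then Bezirke, then Gemeinden.
ORIGIN_ORDER = {"kantone": 0, "district": 1, "gg25": 2}


def sort_location_results(results, limit=10):
    """Take the first `limit` results and order them by origin group, rank, weight."""

    def sort_key(result):
        attrs = result["attrs"]
        return (
            ORIGIN_ORDER.get(attrs["origin"], len(ORIGIN_ORDER)),  # unknown origins last
            attrs["rank"],       # assumption: lower rank is better
            -result["weight"],   # assumption: higher weight is better
        )

    return sorted(results[:limit], key=sort_key)


# Hypothetical results, not real API output.
results = [
    {"weight": 5, "attrs": {"origin": "gg25", "rank": 2, "label": "Bern (BE)"}},
    {"weight": 9, "attrs": {"origin": "district", "rank": 2, "label": "Bern-Mittelland"}},
    {"weight": 7, "attrs": {"origin": "kantone", "rank": 1, "label": "Bern"}},
]
ordered = [r["attrs"]["label"] for r in sort_location_results(results)]
```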
- Having the user define a target Kanton (optional argument) -> This can be done in the search itself, IMHO, with a BOOL parameter like in my POC with "Raumkonzept" "KT_AG".
- Extract geographic names from the abstract during preprocessing and store them in a keyword field, to check for exact matches? -> Yeah, I thought about that as well: use OSM Overpass to get the 10 biggest cities/places and store them in the keywords. But Redis will explode because it will contain a lot of duplicates.
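A canton token such as "KT_AG" could be split off the query before the text search runs. This is a sketch under assumptions: the token format `KT_` plus a two-letter code is inferred from the single example in this thread, and `split_canton_filter` is a made-up name.

```python
import re

# Matches KT_ followed by a two-letter code, with optional surrounding quotes.
CANTON_TOKEN = re.compile(r'"?\bKT_([A-Z]{2})\b"?')


def split_canton_filter(query):
    """Return (remaining_query, canton_code), with canton_code None if absent."""
    match = CANTON_TOKEN.search(query)
    if match is None:
        return " ".join(query.split()), None
    remaining = query[:match.start()] + query[match.end():]
    # Collapse the whitespace left behind by the removed token.
    return " ".join(remaining.split()), match.group(1)


text, canton = split_canton_filter('"Raumkonzept" "KT_AG"')
```

The returned code could then be applied as an optional filter, while the remaining text goes through the normal search.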
Requested as feature by Pasquale/David 21/3
User story:
As a user I want to get results based on location keywords in my query (using Swissnames).
Description:
Swissnames is the largest available collection of geographic names for Switzerland and allows their translation into coordinates (LV03 LN02, LV95 LN02). The dataset comes in various formats (.gdb, .shp, .csv) for point, line and polygon features. If a geographic name is used in the search query, it could be used to filter results based on the coordinates (using Redis geospatial indexing) to narrow down the results.
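Since Redis geospatial commands work with longitude/latitude, LV95 coordinates from Swissnames would have to be converted first. The sketch below uses the approximate polynomial conversion as I recall it from swisstopo's published approximation formulas; the coefficients should be verified against that document (or replaced by a proper projection library) before use. The function name is made up.

```python
def lv95_to_wgs84(east, north):
    """Approximate conversion LV95 (E, N in metres) -> WGS84 (lon, lat in degrees).

    Coefficients as recalled from swisstopo's approximation formulas;
    treat them as an assumption to be verified.
    """
    # Auxiliary values relative to the projection centre in Bern, in units of 1000 km.
    y = (east - 2_600_000) / 1_000_000
    x = (north - 1_200_000) / 1_000_000
    lon = (2.6779094 + 4.728982 * y + 0.791484 * y * x
           + 0.1306 * y * x ** 2 - 0.0436 * y ** 3)
    lat = (16.9023892 + 3.238272 * x - 0.270978 * y ** 2
           - 0.002528 * x ** 2 - 0.0447 * y ** 2 * x - 0.0140 * x ** 3)
    # The polynomials yield units of 10000 arc seconds; convert to degrees.
    return lon * 100 / 36, lat * 100 / 36


lon, lat = lv95_to_wgs84(2_600_000, 1_200_000)  # projection centre in Bern
```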
Cases:
Considerations:
Resources: