Closed jeffbl closed 2 years ago
Moved to August milestone. @gp1702 assigning to you as part of overall preprocessor work, to be reassigned as resources are available. If you feel this should not be within the ML team, let's discuss.
Doing actual assignment...
I am trying to decipher the notion of useful information in the context of a map.
My reading is as follows: if the user is querying information about a page which has an embedded map, then perhaps the most useful information that could be provided to the user is just the geo-tagged location?
I am not sure I understand the need for providing any other information. Let me know if this makes sense.
The idea is that extracting the latitude/longitude or equivalent might let us just bring it up in Google Maps or OpenStreetMap, and create a rendering there, rather than try to make sense of it as a static graphic. So the question is what all we can extract without having to resort to ML or anything, to simplify the preprocessing step. Can discuss further at our meeting tomorrow morning if needed.
@Cybernide indicated last week that lat/lng is there, but still not clear how to get it. That alone is enough to get the first Autour-like experience working... @gp1702 Please let me know if this work item is still unclear after the maps meeting last week.
As mentioned today, @Cybernide is a resource to ping on this one.
So looking back on it, I think we automated this process:
- Get the Google Maps link from the embedded map
- Use regex to find the lat/long inside the URL itself
As an example, see two of the example URLs: https://maps.google.com/maps?ll=45.542418,-73.622002&z=14&t=m&hl=en-CA&gl=CA&mapclient=embed&cid=9956261843513529873 https://maps.google.com/maps?ll=45.535099,-73.577621&z=14&t=m&hl=fr&gl=US&mapclient=embed&q=1831%20Rue%20Gilford%20Montr%C3%A9al%2C%20QC%20H2H%201G6
Notice what is between "maps?ll=" and "&z"? Those are the lat/long coordinates
This page has a regex example for lat/long: https://help.parsehub.com/hc/en-us/articles/226061627-Scrape-latitude-and-longitude-data-from-a-Google-Maps-link
The only issue is that I don't know how often embedded Google maps will have a lat/long in the query, as it isn't required for most of the embedded map options...
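A minimal sketch of the two steps above, assuming the link has already been pulled out of the embedded map and that the coordinates sit in the `ll` query parameter as in the two example URLs. The function name `extract_lat_lng` is made up for illustration, and this parses the query string first rather than running a regex over the whole URL.

```python
import re
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlparse

# Matches "lat,lng" as two signed decimal numbers, e.g. "45.542418,-73.622002".
_LAT_LNG = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*$")


def extract_lat_lng(url: str) -> Optional[Tuple[float, float]]:
    """Return (lat, lng) from the URL's `ll` query parameter, or None if absent."""
    query = parse_qs(urlparse(url).query)
    for value in query.get("ll", []):
        match = _LAT_LNG.match(value)
        if match is None:
            continue
        lat, lng = float(match.group(1)), float(match.group(2))
        # Reject values outside the valid coordinate ranges.
        if -90.0 <= lat <= 90.0 and -180.0 <= lng <= 180.0:
            return lat, lng
    return None
```

Returning `None` when there is no usable `ll` value makes the "lat/long isn't required" case explicit for the caller.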
Multiple types of embedded google maps exist with different values. There's place mode, view mode, directions mode, streetview mode, search mode.
The url of an embedded place mode google map will ALWAYS have a destination listed as a place name, address, plus code, or place ID. Additionally, the url MAY have:
- lat/long
- zoom
- map type (road/satellite)
- language
- region
view mode will always have lat/long and can have:
- zoom
- maptype
- language
- region
directions mode always has an origin and a destination, which are written the same way as the destination in place mode, and can have a lot of optional values like the ones above
streetview will always have at least a lat/long or a panorama ID.
search mode will always have a search term (record+stores+in+Seattle) and can have a lat/long (not required)
https://developers.google.com/maps/documentation/embed/embedding-map
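To make the mode breakdown above concrete, here is a rough sketch that reads the mode from an Embed API URL and checks the parameters each mode always carries. It assumes URLs of the form `https://www.google.com/maps/embed/v1/MODE?key=...` and the parameter names `q`, `center`, `origin`, `destination`, `location` and `pano`, which is my reading of the documentation linked above. The helper name `embed_mode` is hypothetical.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Required parameters per mode (assumed from the Embed API docs).
# Each inner tuple is a group of alternatives: at least one name must be present.
REQUIRED_PARAMS = {
    "place": (("q",),),
    "view": (("center",),),
    "directions": (("origin",), ("destination",)),
    "streetview": (("location", "pano"),),
    "search": (("q",),),
}
_PREFIX = "/maps/embed/v1/"


def embed_mode(url: str) -> Optional[str]:
    """Return the embed mode if the URL looks like a valid Embed API call, else None."""
    parsed = urlparse(url)
    if not parsed.path.startswith(_PREFIX):
        return None
    mode = parsed.path[len(_PREFIX):].strip("/")
    if mode not in REQUIRED_PARAMS:
        return None
    params = parse_qs(parsed.query)
    # Every group of alternatives needs at least one parameter present.
    for alternatives in REQUIRED_PARAMS[mode]:
        if not any(name in params for name in alternatives):
            return None
    return mode
```

Links of the `maps.google.com/maps?ll=...` kind are not Embed API calls, so this returns `None` for them and they would need to be handled separately.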
Seems like a good focus to start with would be place mode and view mode: both would have a lot of common use cases and could serve as a stepping stone towards the others. Furthermore, the lat/long should be easy to scrape if present in the API call, or to discover by reproducing a similar call. For directions mode it might be best to just pull the set of directions plus estimated time etc., without any ML, while streetview mode might be most effectively handled as an outdoor image to be passed to existing preprocessors.
For zoom level, the Google Maps API accepts a value between 0 (whole world) and 21 (individual buildings). This is an optional parameter, so it may not be retrievable from every query. Google does not specify a default value, so the effective zoom may vary by location; the maximum is also sometimes capped below 21 depending on the data available in a given region. A lookup table relating zoom value to approximate perspective in plain English may be useful.
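A possible shape for that lookup table, as a sketch only. The bucket boundaries (0, 5, 10, 15, 20) and the wording of the labels are assumptions based on the rough levels Google describes for its maps (world, continent, city, streets, buildings), and `describe_zoom` is a made-up name.

```python
import bisect

# Lowest zoom value of each bucket, with a plain-English label (assumed buckets).
_ZOOM_STARTS = [0, 5, 10, 15, 20]
_ZOOM_LABELS = ["world", "continent", "city", "streets", "buildings"]


def describe_zoom(zoom: int) -> str:
    """Map a zoom level in the range 0..21 to an approximate plain-English perspective."""
    if not 0 <= zoom <= 21:
        raise ValueError(f"zoom must be between 0 and 21, got {zoom}")
    # Find the last bucket whose starting zoom is <= the given zoom.
    index = bisect.bisect_right(_ZOOM_STARTS, zoom) - 1
    return _ZOOM_LABELS[index]
```

With these buckets, the `z=14` in the example URLs would be described as city level.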
The Maps Embed API can be called for free as many times as needed; however, it seems to be meant just for getting a map of a location and doesn't seem very useful for getting extra information about the location. Google has a Geocoding API that looks like it could get us lat/long and some other info, but this is charged at $5 USD per 1,000 queries up to 100k, $4 USD per 1,000 queries from 100k to 500k, and some type of custom billing for 500k+.
https://developers.google.com/maps/documentation/geocoding/overview https://developers.google.com/maps/documentation/geocoding/usage-and-billing
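For a rough sense of what the geocoding route would cost, here is a small estimate using the rates quoted above. It assumes the tiers are marginal (the first 100k queries at the $5 rate, and only the queries beyond that at the $4 rate); that is my assumption and should be checked against the billing page. The function name is hypothetical.

```python
def geocoding_cost_usd(queries: int) -> float:
    """Estimate the cost in USD for a number of geocoding queries, up to 500k."""
    if queries < 0:
        raise ValueError("queries must be non-negative")
    if queries > 500_000:
        # The thread only says "custom billing" above 500k, so no estimate is possible.
        raise ValueError("volumes above 500k use custom billing")
    first_tier = min(queries, 100_000)       # billed at $5 per 1,000
    second_tier = max(queries - 100_000, 0)  # billed at $4 per 1,000
    return (first_tier * 5 + second_tier * 4) / 1000
```

Under that assumption, 250k queries would come to 100,000 × $5/1000 + 150,000 × $4/1000 = $1,100.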
For pulling out lat/long from API calls that do not have them in the request, the free Google Places API can be used by passing the "q" parameter (a place description or place ID, see https://developers.google.com/maps/documentation/embed/embedding-map#place_mode) from a place-mode Embed API call to a new Places call. This returns simple data such as lat/long as well as additional info about that place which may be of value to the user.
https://developers.google.com/maps/documentation/places/web-service/place-id https://developers.google.com/maps/documentation/places/web-service/search-find-place
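A sketch of what that Places call might look like, using the Find Place endpoint from the second link. It only builds the request URL and parses a response, so it makes no network call. The helper names are hypothetical, the `fields` selection is an assumption, and I have not verified here whether this particular request is free. It covers free-text `q` values only; a place-ID value would need the Place Details endpoint instead, which is not shown.

```python
from typing import Optional, Tuple
from urllib.parse import urlencode

FIND_PLACE_ENDPOINT = "https://maps.googleapis.com/maps/api/place/findplacefromtext/json"


def build_find_place_url(query: str, api_key: str) -> str:
    """Build a Find Place request URL for a free-text place description."""
    params = {
        "input": query,
        "inputtype": "textquery",
        "fields": "name,geometry",  # assumed: geometry carries the lat/lng
        "key": api_key,
    }
    return f"{FIND_PLACE_ENDPOINT}?{urlencode(params)}"


def lat_lng_from_response(payload: dict) -> Optional[Tuple[float, float]]:
    """Pull (lat, lng) of the first candidate out of a Find Place JSON response."""
    if payload.get("status") != "OK":
        return None
    candidates = payload.get("candidates") or []
    if not candidates:
        return None
    location = candidates[0].get("geometry", {}).get("location", {})
    if "lat" not in location or "lng" not in location:
        return None
    return float(location["lat"]), float(location["lng"])
```

The `query` argument would be the decoded `q` value from the embed URL, for example the street address in the second example link.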
Great! So to summarize, it sounds like the URL you'll encounter in the DOM may have lat/lng already embedded, but if not, you'll have to make a (free, if using Google Places API, correct?) remote query to obtain it. Ideally, this would all happen locally in the browser extension (except of course any necessary Google API call), so that if something fails, we can immediately tell the user that we can't get anything for that embedded map, without having to wait for an IMAGE server round trip.
Even better, if we can't get anything from an embedded map, we shouldn't even add a stop on it, or show the menu option. We want to avoid offering the user options that we know won't result in anything useful, since that will just create frustration. So, my uninformed concept of how it would work overall would be that while scanning through the DOM, any embedded map links would immediately be parsed to determine if we can get the lat/lng, and only the ones where we can would be "enabled" for an IMAGE query. That way, if we offer it, we're pretty sure it'll succeed. If this adds significant complication, ping to discuss. Thanks!
Correct. If the lat/long are not explicitly in the API call, we can use the "place" referenced in that call to make a (free) call of our own to the Places API. This also has the ability to give us other info about that place (depending on the kind of place). Makes sense that we should be doing this in the extension and pre-validating, so as not to offer an interpretation of a map that we cannot process.
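The pre-validation described above could be sketched roughly like this, written in Python for illustration even though the real check would live in the extension. The three states, the names `MapSupport` and `classify_embedded_map`, and the parameter names checked (`ll`, `center`, `location` for coordinates, `q` for a place) are all assumptions. Directions-mode origin/destination values are not handled.

```python
import re
from enum import Enum
from urllib.parse import parse_qs, urlparse


class MapSupport(Enum):
    COORDINATES = "coordinates"    # lat/lng found directly in the URL
    NEEDS_LOOKUP = "needs_lookup"  # place text present; a Places call is still required
    UNSUPPORTED = "unsupported"    # nothing usable; do not offer the map option


_LAT_LNG = re.compile(r"^\s*-?\d+(?:\.\d+)?\s*,\s*-?\d+(?:\.\d+)?\s*$")
_COORD_PARAMS = ("ll", "center", "location")  # assumed coordinate-bearing parameters
_PLACE_PARAMS = ("q",)                        # assumed place-bearing parameter


def classify_embedded_map(url: str) -> MapSupport:
    """Decide, from the URL alone, whether an embedded map can be handled."""
    params = parse_qs(urlparse(url).query)
    for name in _COORD_PARAMS:
        if any(_LAT_LNG.match(value) for value in params.get(name, [])):
            return MapSupport.COORDINATES
    if any(params.get(name) for name in _PLACE_PARAMS):
        return MapSupport.NEEDS_LOOKUP
    return MapSupport.UNSUPPORTED
```

Only `COORDINATES` is known to work without any remote call; a `NEEDS_LOOKUP` map would be enabled only once the Places query actually returns a location.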
@jeffbl
@Cybernide more investigation necessary to figure out what would meet user needs?
Never mind. If this is what we can get from Google Maps API, then we're OK here. We'll move investigations over to OSM. I say close. @jeffbl
OK. @Cybernide Please open issue(s) for specific additional work items based on the above list, for user needs, when needed.
@BenMacnaughton @aidanwilliams09 If there is anything remaining on this, can you please break out as a new, clear work item, then close this one? (e.g., the work to do the call to get lat/lng if not immediately available?) If there is additional investigation under this item, please make clear, otherwise log any additional items, and let's close this as it has become unwieldy.
@jeffbl seems to me like it can be closed for now and we can open new issues if the user experience team deems that we need additional data.
@BenMacnaughton Is there an open item for "the call to get lat/lng if not immediately available"? If not, that does need to be logged, correct?
True. I will open that now
When a website embeds a Google map (rather than just a static image of one), can we get enough information to obtain the location and such without having to work directly with the graphic? Things that would be useful:
Examples, which may be embedded in different ways:
https://www.pixar.com/contact-us
https://restauranttandem.com/contact/
https://lepegase.ca/nos-infos-pratiques/nous-contacter/
Added 20211005, possibly embedded differently: https://restaurantguru.com/Sushiyo-Montreal