Shared-Reality-Lab / IMAGE-server

IMAGE project server components

Investigate Google maps embedding: what information can we obtain? #71

Closed jeffbl closed 2 years ago

jeffbl commented 3 years ago

When a website embeds a Google map (rather than just a static image of one), can we get enough information to determine the location and so on without having to work directly with the graphic? Things that would be useful:

Examples, which may be embedded in different ways:

https://www.pixar.com/contact-us

https://restauranttandem.com/contact/

https://lepegase.ca/nos-infos-pratiques/nous-contacter/

Added 20211005, possibly embedded differently: https://restaurantguru.com/Sushiyo-Montreal

jeffbl commented 3 years ago

Moved to August milestone. @gp1702 assigning to you as part of overall preprocessor work, to be reassigned as resources are available. If you feel this should not be within the ML team, let's discuss.

jeffbl commented 3 years ago

Doing actual assignment...

gp1702 commented 3 years ago

I am trying to decipher the notion of useful information in the context of a map.

The way I read this is as follows: if the user is querying information about a page that has an embedded map, then perhaps the most useful information we could provide is just the geo-tagged location?

I am not sure I understand the need for providing any other information. Let me know if this makes sense.

jeffbl commented 3 years ago

The idea is that extracting the latitude/longitude or equivalent might let us just bring it up in Google Maps or OpenStreetMap and create a rendering there, rather than trying to make sense of it as a static graphic. So the question is how much we can extract without having to resort to ML or anything, to simplify the preprocessing step. Can discuss further at our meeting tomorrow morning if needed.

jeffbl commented 3 years ago

@Cybernide indicated last week that lat/lng is there, but still not clear how to get it. That alone is enough to get the first Autour-like experience working... @gp1702 Please let me know if this work item is still unclear after the maps meeting last week.

jeffbl commented 3 years ago

As mentioned today, @Cybernide is a resource to ping on this one.

Cybernide commented 3 years ago

So looking back on it, I think we automated this process:

  1. Get the Google Maps link from the embedded map
  2. Use regex to find the lat/long inside the URL itself

As an example, see two of the example URLs:

https://maps.google.com/maps?ll=45.542418,-73.622002&z=14&t=m&hl=en-CA&gl=CA&mapclient=embed&cid=9956261843513529873

https://maps.google.com/maps?ll=45.535099,-73.577621&z=14&t=m&hl=fr&gl=US&mapclient=embed&q=1831%20Rue%20Gilford%20Montr%C3%A9al%2C%20QC%20H2H%201G6

Notice what is between "maps?ll=" and "&z"? Those are the lat/long coordinates
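
A minimal sketch of that regex step, in Python for illustration only. The function name `extract_lat_lng` and the range check are my own additions, not something from the original automation, and the pattern only covers the `ll=` form shown in the two URLs above.

```python
import re
from typing import Optional, Tuple

# Matches "ll=<lat>,<lng>" as a query parameter. The comma may appear
# URL-encoded as %2C, so both forms are accepted.
LL_PATTERN = re.compile(
    r"[?&]ll=(-?\d+(?:\.\d+)?)(?:,|%2C)(-?\d+(?:\.\d+)?)", re.IGNORECASE
)


def extract_lat_lng(url: str) -> Optional[Tuple[float, float]]:
    """Return (lat, lng) from the ll= parameter, or None if absent or out of range."""
    match = LL_PATTERN.search(url)
    if match is None:
        return None
    lat, lng = float(match.group(1)), float(match.group(2))
    # Reject values that cannot be real coordinates.
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lng <= 180.0):
        return None
    return lat, lng


url = (
    "https://maps.google.com/maps?ll=45.542418,-73.622002&z=14&t=m"
    "&hl=en-CA&gl=CA&mapclient=embed&cid=9956261843513529873"
)
print(extract_lat_lng(url))  # (45.542418, -73.622002)
```

Returning `None` rather than raising makes it easy to fall back to another lookup when the link carries no coordinates.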

aidanwilliams09 commented 3 years ago

Multiple types of embedded google maps exist with different values. There's place mode, view mode, directions mode, streetview mode, search mode.

The url of an embedded place mode google map will ALWAYS have a destination listed as a place name, address, plus code, or place ID. Additionally, the url MAY have:

  1. lat/long
  2. zoom
  3. map type (road/satellite)
  4. language
  5. region

view mode will always have lat/long and can have:

  1. zoom
  2. maptype
  3. language
  4. region

directions mode always has an origin and a destination, which are written the same way as the destination in place mode, and will have a lot of optional values like the ones above

streetview will always have at least a lat/long or a panorama ID.

search mode will always have a search term (record+stores+in+Seattle) and can have a lat/long (not required)

https://developers.google.com/maps/documentation/embed/embedding-map
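
To make the mode breakdown concrete, here is a rough Python sketch that reads the mode from an Embed API URL and checks for the parameter each mode requires. It assumes the URL shape `https://www.google.com/maps/embed/v1/MODE?key=...` and the parameter names `q`, `center`, `origin`, `destination`, `location` and `pano`, which are my reading of the documentation linked above and should be verified against it. `describe_embed` is a hypothetical helper name.

```python
from typing import Dict, List, Optional, Tuple
from urllib.parse import parse_qs, urlparse

# For each mode, the alternative groups of parameters that identify the location.
# Assumed from the Embed API documentation; verify before relying on them.
REQUIRED_PARAMS: Dict[str, List[Tuple[str, ...]]] = {
    "place": [("q",)],
    "view": [("center",)],
    "directions": [("origin", "destination")],
    "streetview": [("location",), ("pano",)],
    "search": [("q",)],
}
# Parameters that, when present, hold "lat,lng" directly.
COORDINATE_PARAMS = ("center", "location")
EMBED_PREFIX = "/maps/embed/v1/"


def describe_embed(url: str) -> Optional[dict]:
    """Return mode, validity and any lat/lng for an Embed API URL, else None."""
    parsed = urlparse(url)
    if not parsed.path.startswith(EMBED_PREFIX):
        return None
    mode = parsed.path[len(EMBED_PREFIX):].strip("/")
    if mode not in REQUIRED_PARAMS:
        return None
    params = {name: values[0] for name, values in parse_qs(parsed.query).items()}
    # Valid if at least one group of required parameters is fully present.
    valid = any(all(p in params for p in group) for group in REQUIRED_PARAMS[mode])
    lat_lng = None
    for name in COORDINATE_PARAMS:
        if name in params:
            try:
                lat, lng = (float(part) for part in params[name].split(","))
            except ValueError:
                continue  # not a plain "lat,lng" pair
            lat_lng = (lat, lng)
            break
    return {"mode": mode, "valid": valid, "lat_lng": lat_lng, "params": params}
```

For a place-mode URL this returns `lat_lng` as `None` unless an optional `center` is present, which is exactly the case where a follow-up lookup on `q` would be needed.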

aidanwilliams09 commented 3 years ago

Maps Embedded API can be called for free as many times as needed, however this seems to be used just to get a map of a location and doesn't seem very useful for getting extra information from the location. Google has a geocoding API that looks like it could get us lat/long and some other info but this is charged at $5USD/1000 queries up to 100k, $4USD/1000 queries from 100k-500k and some type of custom billing for 500k+.

https://developers.google.com/maps/documentation/geocoding/overview https://developers.google.com/maps/documentation/geocoding/usage-and-billing
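
As a quick worked example of those tiers, the sketch below estimates the cost for a given query volume. It uses only the rates quoted in this comment, ignores any free credit or later price changes, and refuses volumes above 500k since that tier is custom-billed. `geocoding_cost_usd` is a hypothetical name.

```python
def geocoding_cost_usd(queries: int) -> float:
    """Estimate geocoding cost in USD from the tiered rates quoted above."""
    if queries < 0:
        raise ValueError("queries must be non-negative")
    if queries > 500_000:
        raise ValueError("volumes above 500k use custom billing")
    first_tier = min(queries, 100_000)       # $5 per 1000 queries
    second_tier = max(queries - 100_000, 0)  # $4 per 1000 queries
    return first_tier * 5.0 / 1000 + second_tier * 4.0 / 1000


print(geocoding_cost_usd(1_000))    # 5.0
print(geocoding_cost_usd(150_000))  # 700.0
```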

aidanwilliams09 commented 3 years ago

This page has a regex example for lat/long: https://help.parsehub.com/hc/en-us/articles/226061627-Scrape-latitude-and-longitude-data-from-a-Google-Maps-link

Only issue is I don't know how often embedded google maps will have a lat/long in the query as it isn't required for most of the embedded maps options...

BenMacnaughton commented 3 years ago

Seems like a good focus to start with would be place mode and view mode: both have a lot of common use cases and could serve as a stepping stone towards the others. Furthermore, the lat/long should be easy to scrape if present in the API call, or to discover by reproducing a similar call. Directions mode might be best handled by just pulling the set of directions plus estimated time etc., without any ML, while streetview mode might be most effectively handled as an outdoor image to be passed to existing preprocessors.

BenMacnaughton commented 3 years ago

For zoom level, the Google Maps API accepts a value between 0 (whole world) and 21 (individual buildings). This is an optional parameter, so it may not be retrievable from every query. Google does not specify a default value, so it may vary by location; the maximum zoom is sometimes capped below 21 depending on the data available in a given region. A lookup table relating zoom value to approximate perspective in plain English may be useful.
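
One possible shape for such a lookup table, as a Python sketch. Only the two endpoints (0 is the whole world, 21 is individual buildings) come from the comment above; the intermediate thresholds and wording are illustrative assumptions that would need tuning with the UX team.

```python
import bisect
from typing import Optional

# (minimum zoom, plain-English description). Intermediate thresholds are
# illustrative assumptions, not values taken from Google's documentation.
ZOOM_LEVELS = [
    (0, "whole world"),
    (3, "continent"),
    (6, "country or large region"),
    (10, "city"),
    (13, "neighbourhood"),
    (15, "streets"),
    (18, "individual buildings"),
]
_THRESHOLDS = [minimum for minimum, _ in ZOOM_LEVELS]


def describe_zoom(zoom: Optional[int]) -> Optional[str]:
    """Map a zoom value to a description; None when the URL carried no zoom."""
    if zoom is None:
        return None
    clamped = max(0, min(21, int(zoom)))  # keep within the documented 0..21 range
    index = bisect.bisect_right(_THRESHOLDS, clamped) - 1
    return ZOOM_LEVELS[index][1]


print(describe_zoom(14))  # neighbourhood
```

Because zoom is optional, the function passes `None` through instead of guessing a default.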

BenMacnaughton commented 3 years ago

For pulling out lat/long from API calls that do not include them in the request, the free Google Places API can be used by passing the "q" parameter (place description or place ID, see https://developers.google.com/maps/documentation/embed/embedding-map#place_mode) from a place-type Embed API call to a new Places call. This returns simple data such as lat/long, as well as additional info about that place which may be of value to the user.

https://developers.google.com/maps/documentation/places/web-service/place-id https://developers.google.com/maps/documentation/places/web-service/search-find-place
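
A hedged sketch of that follow-up call, assuming the Find Place endpoint from the second link above and a response shaped like `{"status": "OK", "candidates": [{"geometry": {"location": {"lat": ..., "lng": ...}}}]}`. The helper names are hypothetical, no network request is made here, and the sketch only handles a text query; a `q` value given as a place ID would likely need the Place Details request instead. Current pricing should be checked before relying on this being free.

```python
from typing import Optional, Tuple
from urllib.parse import urlencode

# Assumed endpoint for "Find Place from text"; check against the Places docs.
FIND_PLACE_ENDPOINT = (
    "https://maps.googleapis.com/maps/api/place/findplacefromtext/json"
)


def build_find_place_url(query: str, api_key: str) -> str:
    """Build a Find Place request that asks only for the place geometry."""
    params = {
        "input": query,
        "inputtype": "textquery",
        "fields": "geometry",
        "key": api_key,
    }
    return f"{FIND_PLACE_ENDPOINT}?{urlencode(params)}"


def lat_lng_from_response(payload: dict) -> Optional[Tuple[float, float]]:
    """Pull (lat, lng) of the first candidate out of a decoded JSON response."""
    if payload.get("status") != "OK":
        return None
    candidates = payload.get("candidates") or []
    if not candidates:
        return None
    location = candidates[0].get("geometry", {}).get("location", {})
    if "lat" not in location or "lng" not in location:
        return None
    return float(location["lat"]), float(location["lng"])
```

The `q` value taken from the embed URL would be passed as `query`, and the decoded JSON body of the reply would go to `lat_lng_from_response`.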

jeffbl commented 3 years ago

Great! So to summarize, it sounds like the URL you'll encounter in the DOM may have lat/lng already embedded, but if not, you'll have to make a (free, if using Google Places API, correct?) remote query to obtain it. Ideally, this would all happen locally in the browser extension (except of course any necessary Google API call), so that if something fails, we can immediately tell the user that we can't get anything for that embedded map, without having to wait for an IMAGE server round trip.

Even better, if we can't get anything from an embedded map, we shouldn't even add a stop on it, or show the menu option. We want to avoid offering the user options that we know won't result in anything useful, since that will just create frustration. So, my uninformed concept of how it would work overall would be that while scanning through the DOM, any embedded map links would immediately be parsed to determine if we can get the lat/lng, and only the ones where we can would be "enabled" for an IMAGE query. That way, if we offer it, we're pretty sure it'll succeed. If this adds significant complication, ping to discuss. Thanks!
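
A rough sketch of that pre-validation pass, written in Python purely to illustrate the logic; in practice it would live in the browser extension. It assumes embedded maps appear as `iframe` elements, that coordinates may sit in an `ll`, `center` or `location` parameter, and that a `q` parameter means a remote lookup could still succeed. All names here are hypothetical, and the host check is deliberately loose.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple
from urllib.parse import parse_qs, urlparse


class IframeCollector(HTMLParser):
    """Collect the src attribute of every iframe in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.sources: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def has_lat_lng(value: str) -> bool:
    """True if value looks like a 'lat,lng' pair of numbers."""
    parts = value.split(",")
    if len(parts) != 2:
        return False
    try:
        float(parts[0])
        float(parts[1])
    except ValueError:
        return False
    return True


def classify_map(src: str) -> Optional[str]:
    """Return 'direct', 'needs_lookup', or None when nothing usable is found."""
    parsed = urlparse(src)
    # Loose check for a Google Maps embed; a real implementation should be stricter.
    if "google." not in parsed.netloc.lower() or "/maps" not in parsed.path:
        return None
    params = {name: values[0] for name, values in parse_qs(parsed.query).items()}
    for name in ("ll", "center", "location"):
        if name in params and has_lat_lng(params[name]):
            return "direct"
    if "q" in params:
        return "needs_lookup"
    return None


def enabled_maps(html: str) -> List[Tuple[str, str]]:
    """List (src, kind) for the embedded maps worth offering to the user."""
    collector = IframeCollector()
    collector.feed(html)
    results = []
    for src in collector.sources:
        kind = classify_map(src)
        if kind is not None:
            results.append((src, kind))
    return results
```

Maps classified as `needs_lookup` would only be enabled once the remote query actually returns coordinates; anything that yields `None` gets no stop and no menu option.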

BenMacnaughton commented 3 years ago

Correct. If the lat/long are not explicitly in the API call, we can use the "place" referenced in that call to make a (free) call of our own to the Places API. This can also give us other info about that place (depending on the kind of place). Makes sense that we should be doing this in the extension and pre-validating, so as not to offer an interpretation of a map that we cannot process.

@jeffbl

jeffbl commented 2 years ago

@Cybernide more investigation necessary to figure out what would meet user needs?

Cybernide commented 2 years ago

Yep. See https://github.com/Shared-Reality-Lab/audio-haptic-graphics-UX/issues/22

Cybernide commented 2 years ago

Never mind. If this is what we can get from Google Maps API, then we're OK here. We'll move investigations over to OSM. I say close. @jeffbl

jeffbl commented 2 years ago

OK. @Cybernide Please open issue(s) for specific additional work items based on the above list, for user needs, when needed.

@BenMacnaughton @aidanwilliams09 If there is anything remaining on this, can you please break out as a new, clear work item, then close this one? (e.g., the work to do the call to get lat/lng if not immediately available?) If there is additional investigation under this item, please make clear, otherwise log any additional items, and let's close this as it has become unwieldy.

BenMacnaughton commented 2 years ago

@jeffbl seems to me like it can be closed for now and we can open new issues if the user experience team deems that we need additional data.

jeffbl commented 2 years ago

@BenMacnaughton Is there an open item for "the call to get lat/lng if not immediately available"? If not, that does need to be logged, correct?

BenMacnaughton commented 2 years ago

True. I will open that now