alltheplaces / alltheplaces

A set of spiders and scrapers to extract location information from places that post their location on the internet.
https://www.alltheplaces.xyz

vans: use new Where2GetIt spider #4755

Open davidhicks opened 1 year ago

davidhicks commented 1 year ago

VansSpider (vans.py) currently uses a SitemapSpider, but it can now be reduced to a single request thanks to the Where2GetIt spider introduced in https://github.com/alltheplaces/alltheplaces/pull/4741

I have found two application keys so far:

  1. 967F95BC-D13C-11EC-BB41-D13F919C4603 from https://hosted.meetsoci.com/vanseu/index.html
  2. CFCAC866-ADF8-11E3-AC4F-1340B945EC6E from https://www.vans.com/en-us/store-locator

Note that (2) uses an alternative endpoint (example), but the appkey probably works for the W2GI API too? Key (1) returns USA stores anyway, so it might be the better key to use?
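To make the "single request" idea concrete, here is a minimal sketch of what one such request might look like, using the two appkeys listed above. The endpoint pattern, the brand name `vanseu` (guessed from the hosted page's URL path), the body layout and the helper `build_single_request` are all assumptions for illustration; they are not taken from the Where2GetIt spider in the linked PR and should be checked against it.

```python
import json

# Application keys quoted in this issue, keyed by the page they were found on.
APP_KEYS = {
    "https://hosted.meetsoci.com/vanseu/index.html": "967F95BC-D13C-11EC-BB41-D13F919C4603",
    "https://www.vans.com/en-us/store-locator": "CFCAC866-ADF8-11E3-AC4F-1340B945EC6E",
}

# ASSUMPTION: this endpoint pattern is illustrative only and unconfirmed.
ASSUMED_ENDPOINT = "https://hosted.where2getit.com/{brand}/rest/getlist"


def build_single_request(brand: str, appkey: str) -> dict:
    """Describe one POST request that asks for all stores at once.

    ASSUMPTION: the body layout below is a guess at the API's shape,
    not a confirmed contract. Nothing is sent over the network here.
    """
    body = {
        "request": {
            "appkey": appkey,
            # An empty "where" filter is assumed to mean "no restriction".
            "formdata": {"objectname": "Locator::Store", "where": {}},
        }
    }
    return {
        "method": "POST",
        "url": ASSUMED_ENDPOINT.format(brand=brand),
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),
    }


# Hypothetical usage with key (1); "vanseu" is a guessed brand name.
request = build_single_request(
    "vanseu", APP_KEYS["https://hosted.meetsoci.com/vanseu/index.html"]
)
print(request["method"], request["url"])
```

The sketch only builds a description of the request, so the keys can be swapped and compared without crawling a sitemap page by page.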

@Cj-Malone

Cj-Malone commented 1 year ago

I actually rewrote vans twice yesterday because it had a bunch of errors. First to a StructuredDataSpider, then to an API spider because it was more efficient. I'm open to it being rewritten to a Where2GetItSpider if it makes sense.

> Note that for (2) there is an alternative end-point used example but the appkey probably works for the W2GI API too?

Yeah it should be the same data, just a different format.

Cj-Malone commented 1 year ago

I can see I never opened a PR for my branch. I can't remember why, or any of the details. But this is a reminder to me to look into it when I'm back home: the current spider still has tons of errors and it collects stockists, not just branded stores.