Open CloCkWeRX opened 6 months ago
I like the idea of a new storefinder here that automatically detects the Next.js build identifier and then downloads a static JSON file specified as a parameter to the storefinder. I don't think the storefinder can or should attempt to parse the JSON file; that should be left to the individual spider. Every brand is free to format the JSON file however it likes, and there is no consistency beyond what some other storefinder built specifically for Next.js integration might provide.
Yeah, it's not a literal store finder, but I agree a Next.js helper could be useful.
https://curaleaf.com/_next/data/sjXHfcM099K6PZVZVtK3H/locations.json

7 results - 7 files

locations/spiders/burger_king_cz.py:

    yield JsonRequest(
        url=f"https://burgerking.cz/_next/data/{next_build_id}/restaurants.json", callback=self.parse_locations
    )

locations/spiders/burger_king_pl.py:

    next_build_id = response.xpath("//script[contains(@src, '_ssgManifest.js')]/@src").get().split("/")[3]
    url = f"https://burgerking.pl/_next/data/{next_build_id}/restaurants.json"
    yield JsonRequest(url=url, callback=self.parse_api)

locations/spiders/crumbl_cookies_us.py:

    next_build_id = response.xpath("//script[contains(@src, '_ssgManifest.js')]/@src").get().split("/")[3]
    url = f"https://crumblcookies.com/_next/data/{next_build_id}/en-US/stores.json"
    yield JsonRequest(url=url, callback=self.parse_api)

locations/spiders/delikatesy_centrum_pl.py:

    next_build_id = response.xpath("//script[contains(@src, '_ssgManifest.js')]/@src").get().split("/")[3]
    url = f"https://www.delikatesy.pl/_next/data/{next_build_id}/sklepy.json"
    yield JsonRequest(url=url, callback=self.parse_api)

locations/spiders/quick_be_lu.py:

    )
    yield JsonRequest(f"https://www.quick.be/_next/data/{build_id}/fr/restaurants.json")

locations/spiders/teknikmagasinet.py:

    start_urls = [
        "https://www.teknikmagasinet.se/_next/data/l27oTv8kIMOzrHw2WFLQi/sv/teknikmagasinet/find-your-store.json"
    ]

locations/spiders/tommy_hr.py:

    item_attributes = {"brand": "Tommy", "brand_wikidata": "Q12643718"}
    start_urls = ["https://www.tommy.hr/_next/data/NQBnI1_5yBtg95innap3m/hr-HR/prodavaonice.json"]
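
Several of the spiders above repeat the same two steps: pull the build ID out of the `_ssgManifest.js` script URL, then format a `/_next/data/{build_id}/...` URL. A rough sketch of how that could be factored out is below. The function names `extract_next_build_id` and `next_data_url`, the `example.com` host and the `abc123` build ID are all hypothetical, and the sketch uses only the standard library rather than Scrapy so it can be run on its own. It is an illustration of the idea, not a proposed final API.

```python
import re
from typing import Optional

# Matches the build ID segment in a script URL such as
# /_next/static/<build_id>/_ssgManifest.js
_SSG_MANIFEST_RE = re.compile(r"/_next/static/([^/\"']+)/_ssgManifest\.js")


def extract_next_build_id(html: str) -> Optional[str]:
    """Return the Next.js build ID found in the page HTML, or None if absent."""
    match = _SSG_MANIFEST_RE.search(html)
    return match.group(1) if match else None


def next_data_url(base_url: str, build_id: str, data_path: str) -> str:
    """Build the URL of a static Next.js data file for the given build ID."""
    return f"{base_url.rstrip('/')}/_next/data/{build_id}/{data_path.lstrip('/')}"


if __name__ == "__main__":
    # Hypothetical page fragment; real pages carry many more script tags.
    html = '<script src="/_next/static/abc123/_ssgManifest.js" defer=""></script>'
    build_id = extract_next_build_id(html)
    if build_id is not None:
        # Parsing the downloaded JSON would stay in the individual spider.
        print(next_data_url("https://example.com", build_id, "en-US/stores.json"))
```

With a helper along these lines, spiders such as teknikmagasinet.py and tommy_hr.py would not need a build ID hardcoded in `start_urls`, which presumably stops working once the site is redeployed with a new build ID.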