nprapps / book-concierge

A concierge for every year
https://apps.npr.org/best-books/

Amazon scraper: Pull page numbers if available #95

Closed: alykat closed 10 months ago

alykat commented 2 years ago

Page counts would be useful for the books team to identify entries that would be tagged "rather long" or "rather short".

alykat commented 10 months ago

Beth reiterated her interest in this feature. I briefly looked into the Amazon scraper code, and it looks like what we're saving out to the CSV is everything that comes back in the search response. I would need to take more time to figure out how to query additional metadata, if this API offers it.

thomaswilburn commented 10 months ago

I think this is actually available in the API: https://webservices.amazon.com/paapi5/documentation/item-info.html#contentinfo

If you look at the bottom of tasks/lib/amazon-product.js, you'll find the keys being extracted, and you can add the page count there.
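
For illustration, here is a minimal sketch of pulling the page count out of a single PA-API 5 item. It assumes the request includes the `ItemInfo.ContentInfo` resource and that the count arrives at `ItemInfo.ContentInfo.PagesCount.DisplayValue`, as described in the documentation linked above. The function name `extractPageCount` and the sample item are hypothetical, and the real extraction code in tasks/lib/amazon-product.js may be organized differently.

```javascript
// Hypothetical helper, not the actual code from tasks/lib/amazon-product.js.
// Returns the page count as a number, or null when Amazon provides none.
function extractPageCount(item) {
  const info = (item && item.ItemInfo) || {};
  const content = info.ContentInfo || {};
  const pages = content.PagesCount;

  // PagesCount is simply absent when there is no page data for the item.
  if (!pages || pages.DisplayValue == null) return null;

  const count = Number(pages.DisplayValue);
  return Number.isFinite(count) ? count : null;
}

// Hand-written sample shaped like a PA-API 5 item; not real data.
const sampleItem = {
  ASIN: "EXAMPLE",
  ItemInfo: {
    ContentInfo: {
      PagesCount: { DisplayValue: 320, Label: "NumberOfPages", Locale: "en_US" }
    }
  }
};

console.log(extractPageCount(sampleItem)); // 320
console.log(extractPageCount({ ASIN: "NOPAGES" })); // null
```

Returning null instead of 0 for missing data should keep the CSV column blank, so the books team does not mistake an unknown length for a "rather short" book.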

alykat commented 10 months ago

Ahhh! I totally missed the second script in the lib folder. This did the trick. Thank you so much!