cmintey / wishlist

Wishlist is a self-hosted wishlist application that you can share with your friends and family. You no longer have to wonder what to get your family for the holidays, simply check their wishlist and claim any available item!
MIT License

Bug: URL scraper not getting specific item metadata from Amazon #94

Closed: gearopolis closed this issue 10 months ago

gearopolis commented 10 months ago

Describe the bug: When adding a new item sold on Amazon.com to a wishlist from my PC, the scraper does not get the correct item name, price, or image. This does not seem to be an issue with URLs from other online stores. I also did not get this bug when adding the same item from my phone.

To Reproduce Steps to reproduce the behavior:

  1. While on a computer, go to 'My Wishes' in the UI
  2. Click on the 'round plus' icon in the bottom right
  3. In the popup interface, paste an Amazon URL in the top field, 'Item URL' (_I used this_)
  4. Click out of or tab out of the field
  5. See that the 'Item Name', 'Price' and 'Image URL' populate incorrectly (image is a Captcha image)

Expected behavior: The fields should populate correctly, as they do with other online stores.

Screenshots: two images (not preserved)

Additional context: As mentioned in the description, I do not experience this bug on my phone. I tried copying the link from the Amazon app and pasting it in, and it worked. Because that link appeared to have been run through a URL shortener, I also browsed to the same product on my phone, copied the URL from the browser, and then attempted to add it to the wishlist; it worked as expected.

cmintey commented 10 months ago

Yeah, unfortunately Amazon doesn't like web scraping all that much, so if they detect it they will sometimes present a captcha, in which case we can't do anything. I already have a retry in place: if we are presented with a captcha, we try again. That fixes it sometimes, but not always. I don't know of an easy workaround that doesn't make the scraper more complex, which I want to avoid. I will make it so that if we can't get past the captcha, the user is shown a message and no data is filled into the form.
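
For readers following along, the retry-then-give-up behavior described above can be sketched roughly as follows. This is only an illustration, not the project's actual code: the names `scrape_with_retry`, `looks_like_captcha`, `ScrapeResult`, the captcha markers, and the user-facing message are all hypothetical, and the page fetcher is passed in as a plain function so the sketch runs without network access.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical markers; a real captcha page may need different checks.
CAPTCHA_MARKERS = ("captcha", "enter the characters you see")


@dataclass
class ScrapeResult:
    html: Optional[str] = None     # page to parse for name, price, and image
    message: Optional[str] = None  # shown to the user when scraping fails


def looks_like_captcha(html: str) -> bool:
    """Return True if the page text contains any of the assumed captcha markers."""
    lowered = html.lower()
    return any(marker in lowered for marker in CAPTCHA_MARKERS)


def scrape_with_retry(
    fetch: Callable[[str], str], url: str, max_attempts: int = 2
) -> ScrapeResult:
    """Fetch the page up to max_attempts times, stopping at the first non-captcha page."""
    for _ in range(max_attempts):
        html = fetch(url)
        if not looks_like_captcha(html):
            return ScrapeResult(html=html)
    # Every attempt hit a captcha: return no page data, only a message,
    # so the form fields stay empty instead of being filled with captcha content.
    return ScrapeResult(
        message="Could not read item details from this URL; please fill in the form manually."
    )
```

A caller would parse `result.html` when it is set, and otherwise show `result.message` and leave the 'Item Name', 'Price', and 'Image URL' fields blank.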

gearopolis commented 10 months ago

That is a great workaround. It is not surprising that Amazon would limit that functionality. I was thinking about sharing this with the rest of the family, and I think that could be a blocker for some of my less tech-savvy family members. Thanks again for sharing and for the great work!

gearopolis commented 10 months ago

I just circled back around to this and it works as described. Thanks!