alirezamika / autoscraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python
MIT License
6.24k stars 654 forks

Not able to scrape images from amazon and flipkart from product pages #72

Closed swaroopvemula closed 2 years ago

notChewy1324 commented 2 years ago

This scrapes text from websites.

anoduck commented 2 years ago

This scrapes text from websites.

It was able to scrape image links, but not the images themselves. Greatly disappointed.

lorey commented 2 years ago

It's one requests.get()?!

anoduck commented 2 years ago

It's one requests.get()?!

Allow me to clarify: in my script, "image_link" is not the direct URL of the image. It is a linked image that resolves to a page on the service hosting the image, such as imagetwist.

The library was able to retrieve the first link, the one leading to the image hosting service, but it was not able to retrieve the direct URL of the image on that hosting service.

So no, it is not as simple as writing one requests.get(), at least not when using the autoscraper library alone.


A full script employing requests_html was written successfully, but it is very specific to just that site and just those image hosting services.
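
For readers hitting the same problem, the two-step lookup described above can be sketched roughly as follows. This is not the script mentioned in this comment and it uses neither autoscraper nor requests_html; it is a minimal standard-library sketch that assumes the hosting page embeds the image in a plain <img> tag in its static HTML. The helper names (extract_image_urls, fetch, download_hosted_image) and the pick parameter are hypothetical, and real hosting services may need a more specific selector or JavaScript rendering.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class _ImageSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def extract_image_urls(page_html, page_url):
    """Return absolute URLs for all <img> tags found in page_html."""
    collector = _ImageSrcCollector()
    collector.feed(page_html)
    # Relative src values are resolved against the page they came from.
    return [urljoin(page_url, src) for src in collector.sources]


def fetch(url, timeout=10):
    """Download a URL and return the raw response body as bytes."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        return response.read()


def download_hosted_image(hosting_page_url, pick=lambda urls: urls[0]):
    """Fetch the hosting page first, then fetch the image it embeds.

    `pick` is a hypothetical hook: it chooses one URL from the candidates,
    because the first <img> on a real page is often a logo or an ad.
    """
    page_html = fetch(hosting_page_url).decode("utf-8", errors="replace")
    candidates = extract_image_urls(page_html, hosting_page_url)
    if not candidates:
        raise ValueError("no <img> tags found on " + hosting_page_url)
    return fetch(pick(candidates))
```

Here the link that autoscraper returns would be passed as hosting_page_url, so two requests are needed per image: one for the hosting page and one for the image itself. Choosing the right <img> is the site-specific part, which is why a general solution is hard to write.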

anoduck commented 2 years ago

Yeah, the issue needed closing.