sananth12 / ImageScraper

:scissors: High performance, multi-threaded image scraper
GNU General Public License v3.0
761 stars, 100 forks

Doesn't scrape images from page #99

Open dineshbvadhia opened 7 years ago

dineshbvadhia commented 7 years ago

Installed ImageScraper with pip and pointed it to https://www.wikiart.org/en/recently-added-artworks and the response was:

C:\Users\Think\VM\aml-1.6\dev>image-scraper -s C:\Users\Think\VM\watest\images https://www.wikiart.org/en/recently-added-artworks

ImageScraper

Requesting page.... Sorry, no images found.

Do you know what the problem is?

sananth12 commented 7 years ago

ImageScraper does not scrape images that are injected into the HTML at runtime. It looks like that's what's happening on the website you mentioned.
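To illustrate the point, here is a minimal sketch of why a static HTML parse finds nothing on such a page. It uses only Python's standard `html.parser` and is not ImageScraper's actual implementation; the class name `ImgCollector`, the function `static_image_sources`, and both sample pages are made up for the example.

```python
from html.parser import HTMLParser


class ImgCollector(HTMLParser):
    """Collects the src attribute of every <img> tag present in the raw HTML."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def static_image_sources(html):
    """Return image URLs visible WITHOUT running any JavaScript."""
    collector = ImgCollector()
    collector.feed(html)
    collector.close()
    return collector.sources


# Hypothetical page whose images are part of the served HTML.
STATIC_PAGE = '<html><body><img src="a.jpg"><img src="b.png"></body></html>'

# Hypothetical page whose image only exists after the script runs in a browser.
INJECTED_PAGE = """<html><body><div id="gallery"></div>
<script>
  var img = document.createElement("img");
  img.src = "c.jpg";
  document.getElementById("gallery").appendChild(img);
</script></body></html>"""

print(static_image_sources(STATIC_PAGE))
print(static_image_sources(INJECTED_PAGE))
```

The second page yields no image URLs because the `<img>` element is only created once a browser executes the script, which is why a browser-driven approach is needed for pages like this.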

sananth12 commented 7 years ago

@dineshbvadhia have you tried using the --injected option?

dineshbvadhia commented 7 years ago

I had to install selenium, which is not listed in requirements.txt, but it still isn't working.

C:\Users\Think\VM>image-scraper -s C:\Users\Think\VM\watest https://www.wikiart.org/en/recently-added-artworks --injected

ImageScraper

Requesting page....

Traceback (most recent call last):
  File "c:\users\think\anaconda3\lib\site-packages\selenium\webdriver\common\service.py", line 74, in start
    stdout=self.log_file, stderr=self.log_file)
  File "c:\users\think\anaconda3\lib\subprocess.py", line 947, in __init__
    restore_signals, start_new_session)
  File "c:\users\think\anaconda3\lib\subprocess.py", line 1224, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified

ssundarraj commented 7 years ago

The documentation is not very clear about this. I think you have to have PhantomJS installed and in your PATH.
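A quick way to check this is to ask Python whether the executable can be found on your PATH. This is only a sketch using the standard library's `shutil.which`; the helper names `find_driver` and `describe_driver` are made up for the example, and the executable name `phantomjs` is an assumption based on the suggestion above.

```python
import shutil


def find_driver(name="phantomjs", path=None):
    """Return the full path to the executable if it is on PATH, else None."""
    return shutil.which(name, path=path)


def describe_driver(name="phantomjs", path=None):
    """Return a human-readable message about whether the executable was found."""
    location = find_driver(name, path)
    if location is None:
        return f"{name} not found on PATH"
    return f"{name} found at {location}"


print(describe_driver())
```

If this reports that the executable was not found, that would be consistent with the `FileNotFoundError: [WinError 2]` in the traceback above, since launching a program that cannot be located fails in `subprocess` with exactly that error.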

nevertoday commented 2 years ago

ImageScraper does not scrape images that are injected into the HTML at runtime. It looks like that's what's happening on the website you mentioned.

But so many web pages use AJAX or other techniques to display images, so I really hope this will be supported.