pevers / images-scraper

Simple and fast scraper for Google
ISC License

Possible to save/download images scraped? #18

Closed gideonjones closed 7 years ago

gideonjones commented 7 years ago

Is it possible to save the images that a search returns? Alternatively, to combine your output with the "request" Node package, which could download an image from its URL, how would one pass/return only the URLs from the output your application provides?

Apologies if the "issues" section isn't the right place for what is essentially a feature request. I realise it's not your job to teach me how to code :) but I thought you guys would be the best to ask. Thanks for your work!

pevers commented 7 years ago

Yes, this is possible. You can use request to download the image and pipe it to a file using fs, as in this example:

http://stackoverflow.com/questions/12740659/downloading-images-with-node-js
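
For the second part of the question (returning only the URLs), a minimal sketch, assuming each result is an object with a url field. The sample data below is hypothetical, and every field other than url is a placeholder rather than documented output of the scraper:

```javascript
// Hypothetical sample shaped like the scraper's results; only `url` is assumed.
const results = [
    { url: 'https://example.com/a.jpg', width: 640, height: 480 },
    { url: 'https://example.org/b.png', width: 800, height: 600 }
];

// Keep only the URL of each result.
const urls = results.map(function (item) {
    return item.url;
});

console.log(urls);
```

The resulting array of strings can then be handed to whatever does the downloading.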

efstathiosntonas commented 6 years ago

Sorry for digging up this old request. Here's complete code for downloading images, just in case someone else wants it.

First, run npm install request.

I trim the image URL to get the filename. I haven't tested what happens with duplicate filenames. Files are downloaded to the root folder of the project.

let Scraper = require('images-scraper'),
    google = new Scraper.Google(),
    fs = require('fs'),
    request = require('request');

google.list({
    keyword: 'banana',
    num: 2,
    detail: true,
    nightmare: {
        show: true
    },
    rlimit: '10',           // number of requests to Google per second, default: unlimited
    timeout: 10000
})
    .then(function (res) {
        res.forEach((item) => {
            let url = item.url;
            // use everything after the last "/" as the local filename
            let name = url.lastIndexOf('/');
            let filename = url.substring(name + 1);
            console.log(filename);
            download(item.url, filename, function () {
                console.log('done');
            });
        });
    }).catch(function (err) {
        console.log('err', err);
    });

// you can also listen for events
google.on('result', function (item) {
    console.log('out', item);
});

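// download the file at uri into filename; callback runs once the file is written
let download = function (uri, filename, callback) {
    request.head(uri, function (err, res) {
        if (err) {
            // res is undefined when the HEAD request fails, so stop here
            console.log('err', err);
            return;
        }
        console.log('content-type:', res.headers['content-type']);
        console.log('content-length:', res.headers['content-length']);

        request(uri).pipe(fs.createWriteStream(filename)).on('close', callback);
    });
};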
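
The filename trimming above keeps any query string in the name and lets two images with the same name overwrite each other. A rough sketch of one way around both, using only Node's built-in URL and path modules. filenameFromUrl and uniqueName are hypothetical helpers written for this example, not part of images-scraper or request, and the URLs are made up:

```javascript
const path = require('path');

// Hypothetical helper: derive a local filename from an image URL,
// ignoring any query string or fragment.
function filenameFromUrl(imageUrl, fallback) {
    const pathname = new URL(imageUrl).pathname;                   // drops "?..." and "#..."
    const base = pathname.substring(pathname.lastIndexOf('/') + 1); // text after the last "/"
    return base || fallback;                                        // empty when the path ends in "/"
}

// Hypothetical helper: add a counter to names that were already used,
// so a later download does not overwrite an earlier one.
function uniqueName(filename, seen) {
    const count = seen.get(filename) || 0;
    seen.set(filename, count + 1);
    if (count === 0) {
        return filename;
    }
    const ext = path.extname(filename);
    return path.basename(filename, ext) + '-' + count + ext;
}

const seen = new Map();
const sampleUrls = [
    'https://example.com/img/banana.jpg?w=200',
    'https://example.org/banana.jpg',
    'https://example.com/gallery/'
];
const names = sampleUrls.map(function (u, i) {
    return uniqueName(filenameFromUrl(u, 'image-' + i), seen);
});
console.log(names);
```

Inside the forEach, the result of uniqueName(filenameFromUrl(item.url, ...), seen) could be passed to download in place of the trimmed filename. This only tracks names used in the current run; it does not check for files already on disk.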