gabceb / node-metainspector

Node npm package for web scraping. It scrapes a given URL and returns its title, meta description, meta keywords, an array of all its links, all its images, etc. Inspired by the metainspector Ruby gem.
MIT License

added limit option in order to prevent downloading tons of data #5

Closed MiroRadenovic closed 10 years ago

MiroRadenovic commented 10 years ago

node-metainspector can explode if it is given a URL that points to a big file, such as an ISO image. To avoid this problem, I have added a limit option that caps the amount of data downloaded by the request object. With my change, the following code:

var MetaInspector  = require('node-metainspector');

var client = new MetaInspector("http://cdimage.debian.org/debian-cd/7.5.0/amd64/iso-cd/debian-7.5.0-amd64-CD-1.iso", { limit: 1024});

client.on("fetch", function(){
    console.log("Description: " + client.description());

    console.log("Links: " + client.links().join(","));
});

client.on("error", function(err){
    console.log(err);
});

client.on("limit", function(){
    console.log("limit reached!");
});

client.fetch();

prevents the ISO from being fully downloaded: the request is aborted after the first 1024 bytes and a "limit" event is emitted.
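For illustration, here is a minimal sketch of how such a byte limit could work on a Node stream. It is not the code from this pull request: the helper name collectWithLimit and its "limit"/"end" events are hypothetical, and it assumes the response is an ordinary readable stream that emits "data" and "end" and may offer destroy().

```javascript
var EventEmitter = require('events');

// Hypothetical helper, NOT part of node-metainspector: collects chunks from
// a readable stream and stops as soon as more than `limit` bytes arrive.
function collectWithLimit(stream, limit) {
    var collector = new EventEmitter();
    var chunks = [];
    var received = 0;
    var done = false;

    function onData(chunk) {
        if (done) { return; }
        var buf = Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk);
        var remaining = limit - received;

        if (buf.length > remaining) {
            // Keep only the bytes that still fit, then stop reading.
            chunks.push(buf.subarray(0, remaining));
            received = limit;
            done = true;
            stream.removeListener('data', onData);
            if (typeof stream.destroy === 'function') {
                stream.destroy(); // aborts the underlying transfer, if any
            }
            collector.emit('limit', Buffer.concat(chunks));
            return;
        }

        chunks.push(buf);
        received += buf.length;
    }

    stream.on('data', onData);
    stream.on('end', function () {
        // The body fit within the limit: hand over everything collected.
        if (!done) {
            done = true;
            collector.emit('end', Buffer.concat(chunks));
        }
    });

    return collector;
}
```

In this sketch a body of exactly the limit size still finishes with "end"; only data beyond the limit triggers "limit". Whether the actual change behaves the same way at that boundary is not stated here.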

Let me know what you think of this, and I can also write the tests. Thank you, Miro

MiroRadenovic commented 10 years ago

I will close this pull request, as some issues need to be resolved first.