tjanczuk / edge

Run .NET and Node.js code in-process on Windows, MacOS, and Linux
http://tjanczuk.github.io/edge

Help! I can't use Edge to scrape web information #268

Closed jiangwest closed 1 hour ago

jiangwest commented 9 years ago

Hi, I want to use edge.js to scrape information from a web page, e.g., the ACM DL web page.


return function (data, callback) {    

    var http = require('http');
    var cheerio = require("cheerio");
    var fs = require('fs');

    //data is the target web site URL
    var acmWeb=data;
    var pageData;

    http.get(acmWeb, function(res){
        res.setEncoding('utf8');
        res.on('data',function(chunk){
            pageData += chunk; //collect web info
        });

        res.on('end', function(){
            // process the collected data here
            var $ = cheerio.load(pageData);

            $('meta').each(function(i, e) {  //filter
                // '\r\n' is a carriage return plus newline;
                // append each result to a file (created if it does not exist)
                fs.appendFile('message.txt', $(e).attr('content') + '\r\n', function (err) {
                    if (err) throw err;  

                    console.log('It\'s saved!');
                });
            });
        });        
    });

   // it will return the processing data
    callback(null, pageData);
}

tjanczuk commented 9 years ago

I think you meant to call the callback function only after the HTTP request has completed?
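
A minimal sketch of that suggestion, using only Node's built-in `http` module. The name `fetchPage` and the `finish` helper are illustrative assumptions, not part of edge.js, and the cheerio/fs processing from the question is left out; it would go inside the `'end'` handler, before the callback is invoked. Note also that `pageData` starts as an empty string, since appending to an uninitialized variable would prefix the result with the text "undefined".

```javascript
const http = require('http');

// Hypothetical helper (not an edge.js API): fetch a page and invoke the
// Node-style callback only once the whole response has arrived.
function fetchPage(url, callback) {
    let pageData = '';   // start empty so the result has no "undefined" prefix
    let done = false;

    // Make sure the callback is invoked at most once.
    function finish(err, result) {
        if (done) return;
        done = true;
        callback(err, result);
    }

    http.get(url, function (res) {
        res.setEncoding('utf8');
        res.on('data', function (chunk) {
            pageData += chunk;           // collect the response body
        });
        res.on('end', function () {
            // Any processing of pageData would happen here, then:
            finish(null, pageData);      // report the result after completion
        });
        res.on('error', finish);
    }).on('error', finish);              // report connection errors as well
}
```

Inside the wrapper from the question, this would presumably be used as `return function (data, callback) { fetchPage(data, callback); }`, so the callback receives the page only after the request has finished.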

Davevb commented 7 years ago

In my script I also use 'http' to call a URL, but it is never called. Any thoughts?