node-js-libs / node.io

How to access the resource URL from the getHtml callback? #137

Closed devasur closed 11 years ago

devasur commented 11 years ago
    var myjob = new nodeio.Job(options, {
        input:false,
        run: function () {
            for (id in options.args[0]){
                var partnerId = options.args[0][id].id;
                var headers = {"partnerId" : partnerId};
                this.getHtml('http://www.kiva.org/partners/' + partnerId, function(err, $) {
                    // Is it possible to get the calling url string here?
                    // Any techniques?
                });

            }
            console.log("Job Done");
            this.emit(0);
        }
    });

I am running the above code in a loop against different URLs for web scraping. Is it possible to get the requested URL inside the getHtml callback?

chriso commented 11 years ago

Why not change this

    this.getHtml('http://www.kiva.org/partners/' + partnerId, function(err, $) {
        //
    });

to this

    var url = 'http://www.kiva.org/partners/' + partnerId;
    this.getHtml(url, function(err, $) {
        console.log(url);
    });

or am I missing something?

devasur commented 11 years ago

Each loop iteration assigns a new value to url, and getHtml is async, so by the time the callback runs, url holds only the last value that was set. Maybe I am missing something basic. Sorry, I'm very new to this.
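
A minimal sketch of the behaviour described above, for anyone who wants to reproduce it without node.io. `fakeGetHtml`, `pending`, `partners` and `seen` are made-up names for this illustration, and `fakeGetHtml` is only a stand-in that defers its callback the way an async request would; it is not part of node.io.

```javascript
// Hypothetical stand-in for getHtml: it stores the callback and runs it
// later, after the loop has finished, much like a real async request.
var pending = [];
function fakeGetHtml(url, callback) {
    pending.push(callback);
}

var partners = [{ id: 1 }, { id: 2 }, { id: 3 }]; // made-up sample ids
var seen = [];

for (var i = 0; i < partners.length; i++) {
    // `var` is function-scoped, so every callback shares this one `url`
    var url = 'http://www.kiva.org/partners/' + partners[i].id;
    fakeGetHtml(url, function (err) {
        seen.push(url);
    });
}

// Simulate the responses arriving once the loop is done
pending.forEach(function (callback) {
    callback(null);
});

console.log(seen); // all three entries are the URL for partner 3
```

All callbacks close over the same variable, so they all report whatever it held when the loop ended.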

chriso commented 11 years ago

Ah, right. An IIFE (immediately invoked function expression) is what you're after:

    var self = this;
    for (var id in options.args[0]) {
        (function (partnerId) {
            // partnerId and url are local to this call, so each callback keeps its own url
            var url = 'http://www.kiva.org/partners/' + partnerId;
            self.getHtml(url, function(err, $) {
                console.log(url);
            });
        })(options.args[0][id].id);
    }
    this.emit(0);
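
For comparison, here is a sketch of the same idea using `Array.prototype.forEach` instead of an explicit IIFE, assuming the partner list is a plain array. The function passed to `forEach` gives each iteration its own scope. As before, `fakeGetHtml`, `pending`, `partners` and `seen` are hypothetical names for this illustration; in the real job the call would be `self.getHtml(...)`.

```javascript
// Hypothetical stand-in for getHtml that defers its callback
var pending = [];
function fakeGetHtml(url, callback) {
    pending.push(callback);
}

var partners = [{ id: 1 }, { id: 2 }, { id: 3 }]; // made-up sample ids
var seen = [];

partners.forEach(function (partner) {
    // A fresh `url` exists for every call of this function
    var url = 'http://www.kiva.org/partners/' + partner.id;
    fakeGetHtml(url, function (err) {
        seen.push(url);
    });
});

// Simulate the responses arriving once the loop is done
pending.forEach(function (callback) {
    callback(null);
});

console.log(seen); // one distinct URL per partner, in order
```

In environments that support ES2015, declaring `url` with `let` or `const` inside the loop body should have the same effect, since those declarations are scoped to each iteration.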
devasur commented 11 years ago

Thanks Chris. Some more learning for me.