peerigon / phridge

A bridge between node and PhantomJS
The Unlicense

Holding a reference to the phantom instance (JS-Object) outside of the ".spawn().then( [...] )" scope? #45

Closed Senci closed 8 years ago

Senci commented 8 years ago

Hey guys, so far I'm quite satisfied with Phridge. Thanks for all the hard work.

This is not a real issue, more of a question or feature request. I'm failing to hold references to my phantom instances outside of the given scope. Ultimately I'm planning to juggle multiple phantom instances, which will hopefully work simultaneously. Although it might be better (in terms of performance) to spawn multiple instances of node and access them from "outside", I hoped for a solution within my node scope.

What I tried:

var phantomInstances = {};

function spawnPhantom (id) {
    return new Promise(function (resolve,reject) {
        phridge.spawn().then(function(phantom) {
            phantomInstances[id] = phantom;
            // required for communication from within the page scope to the node scope
            phantom.run(function(resolve) {
                var page = webpage.create();
                page.onCallback = function(data) {
                    console.log(data);
                };
            });
            resolve(phantom);
        });
    });
}

Analogously, I wrote a function which injects a JavaScript file into a phantom instance by id and filename (the JS file to inject). For testing purposes, this JS file currently only calls the previously defined .onCallback function, thus logging something to the console. Unfortunately, the phantom instance is not accessible. Neither the object I put into phantomInstances[id] nor the phantom instance returned by the resolve of my promise (spawnPhantom) seems to be accessible.

spawnPhantom('ph00').then(function(phantom) {
    // not working:
    phantom.run(function(resolve) {
        // does not get executed, tested to call console.log from here...
        page.injectJs('myScript.js');
    });
    // not working either:
    var ph00 = phantomInstances('ph00')
    ph00.run(
        [...];
    );
});

My first thought was that the phantom process might be disposed by the time I try to access it. This does not seem to be the case, as the phantom process keeps running until I close (SIGINT) the node process.

When I try the same tasks iteratively and within the first spawn()-scope, everything works fine. So please tell me whether what I'm trying is possible and what I'm doing wrong.

Best Regards, Senad

disclaimer: code simplified for readability.

jhnns commented 8 years ago

This should be possible. The PhantomJS child process won't be disposed unless you call dispose() or it has received a SIGINT. This has nothing to do with the function scope.

In your case, there might be several problems:

  1. In your first example, you call phantom.run(...) and then immediately resolve(phantom). Please be aware that phantom.run() is asynchronous, which means that you can't rely on page.onCallback being set until the promise returned by run() has been resolved.
  2. The page variable in your second example has never been initialized. If you're referring to the page you created in your first example, you need to store it as a property of this, because different calls of run() are executed in different scopes. In order to save state, phridge provides an empty object as this (as described in the README).
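Both points can be put together in a minimal sketch. It rests on some assumptions: the helper names spawnPhantom and injectScript are illustrative, phridge is passed in as a parameter (in a real project it would come from require("phridge")), and the form of run() that takes leading arguments and returns a value synchronously is taken from the README, so it should be checked against the installed version.

```javascript
// Registry of live phantom instances, keyed by a caller-chosen id.
var phantomInstances = {};

// Assumption: `phridge` is injected so this sketch does not depend on how the
// module is loaded; normally it would be require("phridge").
function spawnPhantom(phridge, id) {
    return phridge.spawn().then(function (phantom) {
        phantomInstances[id] = phantom;

        // Return the promise from run() so callers only get the instance
        // once the page and its onCallback handler exist (point 1).
        return phantom.run(function () {
            // This function is executed inside PhantomJS, not in node.
            var page = webpage.create();

            page.onCallback = function (data) {
                console.log(data);
            };
            // Store the page on `this` so later run() calls can reach it (point 2).
            this.page = page;
        }).then(function () {
            return phantom;
        });
    });
}

// Hypothetical helper: injects a script file into the page stored above.
function injectScript(id, fileName) {
    return phantomInstances[id].run(fileName, function (fileName) {
        // `this.page` is the page saved by the earlier run() call.
        return this.page.injectJs(fileName);
    });
}
```

A call such as spawnPhantom(phridge, "ph00").then(function () { return injectScript("ph00", "myScript.js"); }) would then only inject the file after the page has been created.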

It would definitely be helpful if you could provide us with a minimal setup that demonstrates your specific problem.

Senci commented 8 years ago

Thanks for the answer. Your suggestions helped me understand phridge a little better. I realized that what I wanted to achieve would not perform well the way I imagined it. Now I'm trying to implement a containerized solution (Docker Swarm) to be able to handle multiple simultaneous operations in a multi-threaded way.

Nonetheless, I am curious whether it is possible to hold a reference to a phantom object within the nodejs scope. I've created a gist with a more complete example. In this simple example, a phantom instance is created and stored in the object "phantomInstances", which lives in the nodejs scope. The phantom object holds a page. Then I try to retrieve the phantom object and inject a JavaScript file into the page, which is then supposed to print "Hello World!" to the console.

jhnns commented 8 years ago

That's probably what you want

Senci commented 8 years ago

Oh okay, now I see where I made a mistake. Thank you for your support! :)