Closed frank-weindel closed 10 years ago
Mhmm interesting. There are two things I would like to know:
pending--
) is the delay still present?0.1.5
is the problem still present?Thanks for the quick reply. Commenting out the line made no difference, however, downgrading to 0.1.5
did fix it. It seems to be working well. Are there any major reasons to not run with 0.1.5? I'd also be curious about what is causing the slow down if you know and feel like explaining it. :)
I switched from HTTP to stdin/stdout with 1.x
which is more secure and - at least in my unit tests - way faster.
I think you're right by blaming the GC for this delay. Unfortunately PhantomJS only allows synchronous reading from stdin
. I'm using setTimeout, 0
to read the next line of stdin
to allow asynchronous callbacks to be executed. If I'd just use a while()
loop, no callback would ever be executed.
Can you upgrade back to 1.x
and gradually increase that timeout? Maybe 0
is to low and PhantomJS' GC can't catch up.
Hi I added pull request that should fix this issues.
I still got some issues with heartbeat. For example when I run async code in phantom but some operation on phantomjs is taking long time to run(blocking phantomjs event loop) - heartbeat messages are cluttering stdin.
Maybe it would be better to make it in form of ping-pong instead of heartbeat?
Other solution would be using something like this to communicate: https://github.com/ariya/phantomjs/issues/10980#issuecomment-21855869
What do you think?
What exactly do you mean by "cluttering stdin"? Does it block phantomjs in any way? Or do you read something else from stdin?
I mean that while phantom is busy with sync opperation node process is sending to it heartbeat messages. After finishing this sync operation phanton needs to read all of the sent heartbeat messages (it takes quite a lot of time depending on how long phantom was bussy and unable to read stdin messages)
Just an FYI. The latest version w/ @aju's PR seems to eliminate the accumulation issue. Thanks a lot.
@aju ping/pong would probably be better... but I'm surprised that it is a problem at all. That's just a matter of reading a few bytes...
Its not about a size :). When phantomjs get busy node is still sending heartbeat messages. Those messages are accumulating (because phantom is busy with some operation and can't handle them). All of those messages needs to be read from stdin one by one in the loop. I checked and "setTimeout(lopp, 0);" is triggered every ~8-10ms. So when phantom was busy for lets say 100ms we got 10 messages to read - this will take around 80ms-100ms if I'm correct.
I hope you now get my point.
That's why I think we should replace it with ping-pong version. I'm already experimenting with this.
Yep, you're right. Thanks for the explanation!
I'm trying to create an application that renders PNG files of many webpages one at a time. I'm doing it with one instance of phantom using phridge. For each webpage I want to render I create a new page via phridge's phantom.createPage() and then call run() on it. The run() callback (which runs in phantomjs land) simply does a page.open() and renders the PNG before returning.
I'm noticing that with every subsequent run() call I do, it takes more and more time for the phantomjs function to start executing.
The severity of the accumulating delay is related to how much data we're pushing through via the run() call, which in my case is about 500kb. This causes about an additional 2 seconds for each call (i.e. the first one takes 0 seconds, then 2 seconds, then 4 seconds, and so on) However, even without all of that data there is still an accumulation albeit a lot smaller (i.e. 0 seconds, 0.1 seconds, 0.2 seconds, 0.3 seconds, etc)
I'm thinking it has to do with the plumbing in writing to stdin. As in its getting clogged up and hence slowed down. When I add about a 4 second delay between subsequent run calls, the accumulation goes away. So my thinking is that some sort of garbage collection is allowed to run in this time.
Any ideas?