chrome-php / chrome

Instrument headless chrome/chromium instances from PHP
MIT License
2.25k stars 274 forks source link

Killed: 9 #11

Closed agentmarine closed 6 years ago

agentmarine commented 6 years ago

Running a scan across the full site eventually fires off Killed: 9

tried implementing a logger interface from Log::class, but getting nothing in the log, is the implementation for this correct? or is it implemented differently?

(using the built in Log class from laravel which impliments the PSR\LoggerInterface)

$browser = $browserFactory->createBrowser([
    'headless'        => true,         // disable headless 
    'debugLogger'     => Log::class // will enable verbose mode
]);
agentmarine commented 6 years ago

Seems this was a really bad implimentation on my part where it was implimented inside a foreach loop. Adjusted it to re-use the page object with the navigation and then close the browser at the end of the job rather than every time it processes a single url from a domain.

gsouf commented 6 years ago

@agentmarine ok most likely it was a memory issue. Makes me think that Page::close is not implemented, could help in your case to use a single browser and close pages in the loop.

As for the logger you should not pass the classname, but instead an instance of the logger. If you pass a string (the calss name) then it will it will treat it as it is a resource string like "php://stdout".

By the way, I dont work with laravel but as far as I know laravel logger uses static method to access a singleton instance. Maybe can you do Log::getFacadeRoot to get the loggerInterface instance.

agentmarine commented 6 years ago

Hi @gsouf running into this issue again Originally I closed the browser at the end of the foreach loop, how ever i re-used the Page during this by creating the page before the foreach and then just navigating with it. I've attempted to close the page after I am done using Page::close but i get this error

[2018-04-27 09:11:53] local.ERROR: Call to undefined method HeadlessChromium\Page::close() {"exception":"[object] (Symfony\\Component\\Debug\\Exception\\FatalThrowableError(code: 0): Call to undefined method HeadlessChromium\\Page::close() at /Users/dansmacbook/projects/project-scrapper/app/Jobs/Scan.php:107)
[stacktrace]
$page = $browser->createPage();// <-- previously existed outside of foreach loop.
$page->navigate($url)->waitForNavigation();
$pageCookie = $page->evaluate ('document.cookie')->getReturnValue();
$page->close(); // <-- doesnt exist
gsouf commented 6 years ago

Yes that's what I said in the previous message, the close method does not exist. I guess you need it to save resources?

agentmarine commented 6 years ago

Don't particularly need to save any thing, primarily using it to load a page, force the JS to load and grab all the cookies after that and take a screenshot of the page in question.

$browserFactory = new BrowserFactory();
$browser = $browserFactory->createBrowser([
    'headless'        => true,         // enable headless mode
    'debugLogger'     => Log::getFacadeRoot() // will enable verbose mode
]);
$page = $browser->createPage();

foreach ($urls as $url)
{
    $page->navigate($url)->waitForNavigation();
    $pageCookie = $page->evaluate ('document.cookie')->getReturnValue();
    $pageCookie = preg_split ("/[&;]/", $pageCookie);
    ....
}
$browser->close();

this is a snippet of the code currently being used.

agentmarine commented 6 years ago

Looks like PhantomJS originally had this issue too. Phantom JS Issue #12900 I will attempt their fix locally, I'm also going to attempt to put this on our linux server to see if that will have any issues.

gsouf commented 6 years ago

@agentmarine I just released version 0.1.4 where you can close a page with $page->close();

agentmarine commented 6 years ago

@gsouf Great I will give that an attempt as well!

agentmarine commented 6 years ago

this is a full stack trace which then results in a Killed: 9 signal

laravel.log

gsouf commented 6 years ago

@agentmarine can you please create a code snippet that can reproduce this bug please?

agentmarine commented 6 years ago

OK I've added it here. I've removed the database interaction that i have. code sample.txt

gsouf commented 6 years ago

Thanks, I'll give a try hen I have a little time. My guess is that some data are accumulated and never cleaned that cause general memory outage.

agentmarine commented 6 years ago

seeing an issue after deploying this to our linux server

[2018-05-01 12:22:32] local.DEBUG: socket(1): → sending data:{"id":1225,"method":"Target.sendMessageToTarget","params":{"message":"{\"id\":1224,\"method\":\"Runtime.evaluate\",\"params\":{\"awaitPromise\":true,\"returnByValue\":true,\"expression\":\"document.cookie\"$
[2018-05-01 12:22:32] local.DEBUG: socket(1): ← receiving data:{"id":1225,"result":{}}
[2018-05-01 12:22:32] local.DEBUG: socket(1): ← receiving data:{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"7F79B983E5C187E30BEB765896EF7210:97","message":"{\"id\":1224,\"result\":{\"result\":{\"type\":\"string\",\"value\":\"_gat_UA-69658807-1=$
[2018-05-01 12:22:32] local.DEBUG: socket(1): → sending data:{"id":1226,"method":"Target.closeTarget","params":{"targetId":"7F79B983E5C187E30BEB765896EF7210"}}
[2018-05-01 12:22:32] local.DEBUG: socket(1): → sending data:{"id":1227,"method":"Target.createTarget","params":{"url":"about:blank"}}
[2018-05-01 12:22:32] local.DEBUG: socket(1): ← receiving data:{"method":"Target.targetInfoChanged","params":{"targetInfo":{"targetId":"7F79B983E5C187E30BEB765896EF7210","type":"page","title":"Our Locations","url":"https://www.google.com/about/locations/?region=europ$
[2018-05-01 12:22:32] local.DEBUG: socket(1): ← receiving data:{"method":"Target.detachedFromTarget","params":{"sessionId":"7F79B983E5C187E30BEB765896EF7210:97","targetId":"7F79B983E5C187E30BEB765896EF7210"}}
[2018-05-01 12:22:32] local.DEBUG: socket(1): ← receiving data:{"method":"Target.targetDestroyed","params":{"targetId":"7F79B983E5C187E30BEB765896EF7210"}}
[2018-05-01 12:22:32] local.DEBUG: socket(1): ← receiving data:{"id":1226,"result":{"success":true}}
[2018-05-01 12:22:32] local.DEBUG: connection: ⇶ dispatching method:Target.targetInfoChanged
[2018-05-01 12:22:32] local.DEBUG: connection: ⇶ dispatching method:Target.detachedFromTarget
[2018-05-01 12:22:32] local.DEBUG: connection: ⇶ dispatching method:Target.targetDestroyed
[2018-05-01 12:22:32] local.DEBUG: ✘ session(7F79B983E5C187E30BEB765896EF7210:97) was destroyed and unreferenced.
[2018-05-01 12:22:32] local.DEBUG: ✘ target(7F79B983E5C187E30BEB765896EF7210) was destroyed and unreferenced.
[2018-05-01 12:22:32] local.DEBUG: socket(1): ← receiving data:{"method":"Target.targetCreated","params":{"targetInfo":{"targetId":"4B441422FA18DE9DC0D5B4809892F3AA","type":"page","title":"","url":"","attached":false}}}
[2018-05-01 12:22:32] local.DEBUG: socket(1): ← receiving data:{"id":1227,"result":{"targetId":"4B441422FA18DE9DC0D5B4809892F3AA"}}
[2018-05-01 12:22:32] local.DEBUG: connection: ⇶ dispatching method:Target.targetCreated
[2018-05-01 12:22:32] local.DEBUG: socket(1): → sending data:{"id":1228,"method":"Target.attachToTarget","params":{"targetId":"4B441422FA18DE9DC0D5B4809892F3AA"}}
[2018-05-01 12:22:32] local.DEBUG: socket(1): ← receiving data:{"method":"Target.targetInfoChanged","params":{"targetInfo":{"targetId":"4B441422FA18DE9DC0D5B4809892F3AA","type":"page","title":"","url":"about:blank","attached":true}}}
[2018-05-01 12:22:32] local.DEBUG: connection: ⇶ dispatching method:Target.targetInfoChanged
[2018-05-01 12:22:32] local.DEBUG: socket(1): ← receiving data:{"method":"Target.attachedToTarget","params":{"sessionId":"4B441422FA18DE9DC0D5B4809892F3AA:98","targetInfo":{"targetId":"4B441422FA18DE9DC0D5B4809892F3AA","type":"page","title":"","url":"about:blank","at$
[2018-05-01 12:22:32] local.DEBUG: socket(1): ← receiving data:{"id":1228,"result":{"sessionId":"4B441422FA18DE9DC0D5B4809892F3AA:98"}}
[2018-05-01 12:22:32] local.DEBUG: connection: ⇶ dispatching method:Target.attachedToTarget
[2018-05-01 12:22:32] local.DEBUG: socket(1): → sending data:{"id":1230,"method":"Target.sendMessageToTarget","params":{"message":"{\"id\":1229,\"method\":\"Page.getFrameTree\",\"params\":[]}","sessionId":"4B441422FA18DE9DC0D5B4809892F3AA:98"}}
[2018-05-01 12:22:32] local.DEBUG: socket(1): ← receiving data:{"method":"Target.targetInfoChanged","params":{"targetInfo":{"targetId":"4B441422FA18DE9DC0D5B4809892F3AA","type":"page","title":"about:blank","url":"about:blank","attached":true}}}
[2018-05-01 12:22:32] local.DEBUG: connection: ⇶ dispatching method:Target.targetInfoChanged
[2018-05-01 12:22:32] local.DEBUG: socket(1): ← receiving data:{"id":1230,"result":{}}
[2018-05-01 12:22:32] local.DEBUG: socket(1): ← receiving data:{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"4B441422FA18DE9DC0D5B4809892F3AA:98","message":"{\"id\":1229,\"result\":{\"frameTree\":{\"frame\":{\"id\":\"4B441422FA18DE9DC0D5B4809892$
[2018-05-01 12:22:32] local.DEBUG: socket(1): → sending data:{"id":1232,"method":"Target.sendMessageToTarget","params":{"message":"{\"id\":1231,\"method\":\"Page.enable\",\"params\":[]}","sessionId":"4B441422FA18DE9DC0D5B4809892F3AA:98"}}
[2018-05-01 12:22:32] local.DEBUG: socket(1): ← receiving data:{"id":1232,"result":{}}
[2018-05-01 12:22:32] local.DEBUG: socket(1): ← receiving data:{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"4B441422FA18DE9DC0D5B4809892F3AA:98","message":"{\"id\":1231,\"result\":{}}","targetId":"4B441422FA18DE9DC0D5B4809892F3AA"}}
[2018-05-01 12:22:32] local.DEBUG: socket(1): → sending data:{"id":1234,"method":"Target.sendMessageToTarget","params":{"message":"{\"id\":1233,\"method\":\"Page.setLifecycleEventsEnabled\",\"params\":{\"enabled\":true}}","sessionId":"4B441422FA18DE9DC0D5B4809892F3AA$
[2018-05-01 12:22:32] local.DEBUG: socket(1): ← receiving data:{"id":1234,"result":{}}
[2018-05-01 12:22:32] local.DEBUG: socket(1): ← receiving data:{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"4B441422FA18DE9DC0D5B4809892F3AA:98","message":"{\"method\":\"Page.lifecycleEvent\",\"params\":{\"frameId\":\"4B441422FA18DE9DC0D5B48098$
[2018-05-01 12:22:32] local.DEBUG: socket(1): ← receiving data:{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"4B441422FA18DE9DC0D5B4809892F3AA:98","message":"{\"method\":\"Page.lifecycleEvent\",\"params\":{\"frameId\":\"4B441422FA18DE9DC0D5B48098$
[2018-05-01 12:22:32] local.DEBUG: socket(1): ← receiving data:{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"4B441422FA18DE9DC0D5B4809892F3AA:98","message":"{\"method\":\"Page.lifecycleEvent\",\"params\":{\"frameId\":\"4B441422FA18DE9DC0D5B48098$
[2018-05-01 12:22:32] local.DEBUG: session(4B441422FA18DE9DC0D5B4809892F3AA:98): ⇶ dispatching method:Page.lifecycleEvent
[2018-05-01 12:22:32] local.DEBUG: session(4B441422FA18DE9DC0D5B4809892F3AA:98): ⇶ dispatching method:Page.lifecycleEvent
[2018-05-01 12:22:32] local.DEBUG: session(4B441422FA18DE9DC0D5B4809892F3AA:98): ⇶ dispatching method:Page.lifecycleEvent
[2018-05-01 12:22:32] local.DEBUG: socket(1): ← receiving data:{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"4B441422FA18DE9DC0D5B4809892F3AA:98","message":"{\"method\":\"Page.lifecycleEvent\",\"params\":{\"frameId\":\"4B441422FA18DE9DC0D5B48098$
[2018-05-01 12:22:32] local.DEBUG: socket(1): ← receiving data:{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"4B441422FA18DE9DC0D5B4809892F3AA:98","message":"{\"method\":\"Page.lifecycleEvent\",\"params\":{\"frameId\":\"4B441422FA18DE9DC0D5B48098$
[2018-05-01 12:22:32] local.DEBUG: socket(1): ← receiving data:{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"4B441422FA18DE9DC0D5B4809892F3AA:98","message":"{\"id\":1233,\"result\":{}}","targetId":"4B441422FA18DE9DC0D5B4809892F3AA"}}
[2018-05-01 12:22:32] local.DEBUG: session(4B441422FA18DE9DC0D5B4809892F3AA:98): ⇶ dispatching method:Page.lifecycleEvent
[2018-05-01 12:22:32] local.DEBUG: session(4B441422FA18DE9DC0D5B4809892F3AA:98): ⇶ dispatching method:Page.lifecycleEvent

this is the bottom portion of the log and when its running in a queue it outputs this

[2018-05-01 12:21:32][aXqqXjG85a409mME0Ys4rrwidO5mTGZ0] Processing: App\Jobs\Scan
Killed
gsouf commented 6 years ago

Hi @agentmarine I think this is the same probleme, probably memory again. Sorry for the delay to look into this issue, currently in hollyday, I will do my best to check it out quickly

gsouf commented 6 years ago

Hi @agentmarine Are you still having the same problems with latest version of the project?

agentmarine commented 6 years ago

I dont believe so, based on the way I got it set up where I used a single instance for all the tabs, with appropriate use of closing tabs when finished with them and then terminating the chrome executable when I was finished with them managed to get it all resolved.

gsouf commented 6 years ago

Ok, thanks for your feedback. I'm closing the issue then. Feel free to reopen it if it occurs again under those conditions.