chrome-php / chrome

Instrument headless chrome/chromium instances from PHP

Chrome process stopped before startup completed #520

Open MarcPinnell opened 1 year ago

MarcPinnell commented 1 year ago

I'm trying to get this running so I can convert HTML files. When I try to convert, I get the error message in the title (sorry, not very informative!). Any suggestions?

enricodias commented 1 year ago

See #125, #209, #261 and #303.

T313C0mun1s7 commented 1 year ago

Why did you close this? There is no indication it is solved.

I am working on this with Marc, and none of your linked issues have anything to do with this.

What we are seeing is that running the command in the shell works, although it also outputs several warnings/errors, while running it through a web server that interprets the PHP fails. However, if I run the same PHP file from the CLI, it runs just fine. In addition, since the command works from the CLI, we created a bash script to run it and used shell_exec() to call the script, and that too fails. We have even gone as far as running the PHP file that calls the bash script via shell_exec() from the CLI, and it works. It only fails if a web server is interpreting the PHP file.

Here is an example of that last scenario working:

╭╴ HOST: wvm-cattle | USER: root | PATH: /home/cattle.prime42.dev/public_html 
╰──╢ 01:09 PM ║ #» /usr/local/lsws/lsphp81/bin/php ./pdftest.php
[0520/130952.286046:WARNING:bluez_dbus_manager.cc(247)] Floss manager not present, cannot set Floss enable/disable.
[0520/130952.398026:WARNING:sandbox_linux.cc(393)] InitializeSandbox() called with multiple threads in process gpu-process.
[0520/130952.410761:ERROR:command_buffer_proxy_impl.cc(128)] ContextResult::kTransientFailure: Failed to send GpuControl.CreateCommandBuffer.
2865478 bytes written to file /home/cattle.prime42.dev/public_html/test.pdf

Here is an example of the command from the bash script working when run directly from the shell. Note the warnings/errors about freedesktop and the SUID sandbox, which should not be there as we are running headless and with no sandbox. It is as if Chrome is simply ignoring those options:

╭╴ HOST: webdev | USER: root | PATH: ~ 
╰──╢ 12:53 PM ║ #» /opt/google/chrome/google-chrome --headless=new --no-sandbox --print-to-pdf --print-to-pdf="/home/sites.local/cattle.local/public/assets/app/_generatedfiles/yetAnotherTest.pdf" /home/sites.local/cattle.local/public/assets/app/_generatedfiles/cattle_test.html
[658783:658892:0520/140133.899586:ERROR:object_proxy.cc(623)] Failed to call method: org.freedesktop.DBus.Properties.Get: object_path= /org/freedesktop/UPower: org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.UPower was not provided by any .service files
[658783:658892:0520/140133.900496:ERROR:object_proxy.cc(623)] Failed to call method: org.freedesktop.UPower.GetDisplayDevice: object_path= /org/freedesktop/UPower: org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.UPower was not provided by any .service files
[658783:658892:0520/140133.901417:ERROR:object_proxy.cc(623)] Failed to call method: org.freedesktop.UPower.EnumerateDevices: object_path= /org/freedesktop/UPower: org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.UPower was not provided by any .service files
263135 bytes written to file /home/sites.local/cattle.local/public/assets/app/_generatedfiles/yetAnotherTest.pdf
[0520/140140.489769:ERROR:nacl_helper_linux.cc(355)] NaCl helper process running without a sandbox!
Most likely you need to configure your SUID sandbox correctly

I can't really tell what is happening when we let the web server attempt the same command in PHP via shell_exec(), as it just does a core dump. No matter how we approach this, it works from the shell, either directly or via PHP, but fails when the PHP is processed by the web server.

T313C0mun1s7 commented 1 year ago

I will also add that this behaviour is identical on two different web servers. One is on bare metal, running AlmaLinux 8.8 and CyberPanel (OpenLitespeed) with PHP 8.1, and the other is on an AWS EC2 instance, running AlmaLinux 8.7 and CyberPanel (OpenLitespeed). The paths are different between the two servers, but not much else is.

It doesn't matter if we run as root or as the web server user. Root generates an extra error or two, but the web server user has all the required permissions. I have tested that extensively. I also made the chrome binary available via sudo without a password and tried that. The only difference that made was that the owner of the generated PDF file was root and needed to be changed to the web server user. It still generated the file when run from the shell and fell on its face when attempted from the web server.

enricodias commented 1 year ago

I closed it because it was the same error from the other issues with no new information, but I can leave it open to see if somebody helps.

Your script should behave the same both in the command line and in the web server. If it doesn't, the web server is not executing the same binary and/or loading the same configs and permissions. The web server can run the script with another user, or change the uid/gid of the php process during execution (mod_ruid) or even use an external process to execute the php (fpm, cgi).
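One way to test this suggestion is to capture the same set of facts about the process from both contexts and compare them. The following is a minimal sketch, assuming the difference lies in the process environment (user, groups, working directory, HOME, PATH, resource limits); the function name `env_fingerprint` and the choice of fields are illustrative and not part of chrome-php.

```shell
#!/bin/sh
# env_fingerprint: print the parts of the process environment that commonly
# differ between an interactive shell and a web server worker.
# The name and the selection of fields are illustrative assumptions.
env_fingerprint() {
    echo "user=$(id -un 2>/dev/null || echo unknown)"
    echo "uid=$(id -u)"
    echo "groups=$(id -G)"
    echo "cwd=$(pwd)"
    echo "HOME=${HOME:-<unset>}"
    echo "PATH=${PATH:-<unset>}"
    echo "TMPDIR=${TMPDIR:-<unset>}"
    echo "open_files_limit=$(ulimit -n)"
    # A missing or read-only HOME is one candidate difference worth checking.
    if [ -n "${HOME:-}" ] && [ -w "$HOME" ]; then
        echo "home_writable=yes"
    else
        echo "home_writable=no"
    fi
}

env_fingerprint
```

Saving this as a script, running it once from the shell and once through shell_exec() in a page served by the web server, and then diffing the two outputs shows whether the two contexts really match beyond the user name and php.ini.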

T313C0mun1s7 commented 1 year ago

I have ensured that the user used by the web server is the user I used from the shell in testing. It didn't change anything. Also we have compared the php.ini used by the web server with the one used by the cli call to php, and they were the same. We are calling the Chrome binary with the full path to the binary, so they too are the same. In addition I made sure the php binary used by the web server and the cli were the same binary at the same path.

I really don't know what else to check that would make the environment different. My wild guess is that the shell ignores the warnings and errors and continues, but the web server sees those and halts on error. We have yet to be able to get it to run without outputting errors, including errors that I feel should not be there given our usage of --headless=new and --no-sandbox.
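The guess above can be checked by recording the exit status alongside the output: messages on stderr do not by themselves change the exit status, and shell_exec() returns only stdout, so stderr is lost unless it is redirected. The wrapper below is a rough sketch of that idea; `run_logged` and the log path in the usage comment are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# run_logged LOGFILE COMMAND [ARGS...]
# Runs COMMAND, appends its stdout and stderr to LOGFILE together with the
# exit status, and returns that same status to the caller.
# (run_logged is a hypothetical helper, not part of Chrome or chrome-php.)
run_logged() {
    log=$1
    shift
    {
        echo "--- $(date '+%Y-%m-%dT%H:%M:%S') running: $*"
        "$@" 2>&1          # stderr is merged into the log with stdout
        status=$?
        echo "--- exit status: $status"
    } >>"$log"
    return "$status"
}

# Example call, using the binary and flags quoted in this thread; the log
# and output paths are placeholders:
#   run_logged /tmp/chrome-run.log /opt/google/chrome/google-chrome \
#       --headless=new --no-sandbox --print-to-pdf=/tmp/out.pdf /tmp/in.html
```

If the log from the web server run ends with a non-zero status, Chrome itself failed; if it ends with status 0 and only warnings, the failure is happening somewhere after the command returns.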

kuito commented 1 year ago

I am experiencing the exact same thing.

I have the initial PHP script, and when I run it from the CLI it works fine. In the browser, nope: it times out. I added the debug logger, then attempted to run it via a second PHP file using passthru() (also system(), exec(), etc.).

At the end the error I see is:

chrome.DEBUG: socket(1): ← receiving data:{"method":"Inspector.targetCrashed","params":{},"sessionId":"6EDD1DA249F87054992D9F284FF15B80"} [] []
[2023-09-07T04:14:09.128130+00:00] chrome.DEBUG: session(6EDD1DA249F87054992D9F284FF15B80): ⇶ dispatching method:Inspector.targetCrashed [] []
[2023-09-07T04:14:09.128531+00:00] chrome.DEBUG: socket(1): ← receiving data:{"method":"Target.targetCrashed","params":{"targetId":"F57DFA56B8F892DC7645F33CD8374F6E","status":"crashed","errorCode":133}} [] []
[2023-09-07T04:14:09.128590+00:00] chrome.DEBUG: connection: ⇶ dispatching method:Target.targetCrashed [] []
[2023-09-07T04:14:39.162329+00:00] chrome.DEBUG: process: killing chrome [] []
[2023-09-07T04:14:39.162447+00:00] chrome.DEBUG: process: trying to close chrome gracefully [] []
[2023-09-07T04:14:39.162489+00:00] chrome.DEBUG: socket(1): → sending data:{"id":12,"method":"Browser.close","params":{}} [] []
[2023-09-07T04:14:39.163329+00:00] chrome.DEBUG: socket(1): ← receiving data:{"id":12,"result":{}} [] []
[2023-09-07T04:14:39.163525+00:00] chrome.DEBUG: socket(1): disconnecting [] []
[2023-09-07T04:14:39.163995+00:00] chrome.DEBUG: socket(1): ✓ disconnected [] []
[2023-09-07T04:14:39.164057+00:00] chrome.DEBUG: socket(1): disconnecting [] []
[2023-09-07T04:14:39.164087+00:00] chrome.DEBUG: socket(1): ✗ could not disconnect [] []
[2023-09-07T04:14:39.164115+00:00] chrome.DEBUG: process: waiting for process to close [] []
[2023-09-07T04:14:39.199897+00:00] chrome.DEBUG: process: cleaning temporary

the relevant bit probably being: {"method":"Target.targetCrashed","params":{"targetId":"F57DFA56B8F892DC7645F33CD8374F6E","status":"crashed","errorCode":133}}
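One possible reading of errorCode 133, not confirmed anywhere in this thread, is that it follows the common Unix convention where a status above 128 means the process was terminated by signal number (status - 128). Under that assumption, the small helper below (a sketch; `describe_status` is a made-up name) decodes such a value.

```shell
#!/bin/sh
# describe_status N: interpret N as a shell-style exit status.
# Assumption: values above 128 mean "terminated by signal N - 128".
describe_status() {
    code=$1
    if [ "$code" -gt 128 ]; then
        sig=$((code - 128))
        # kill -l maps a signal number to its name where the shell supports it.
        name=$(kill -l "$sig" 2>/dev/null || echo unknown)
        echo "signal $sig ($name)"
    else
        echo "exit code $code"
    fi
}

describe_status 133
```

For 133 this yields signal 5, which is SIGTRAP on Linux. If the assumption holds, the renderer was killed by a signal rather than exiting on its own, which points at the process environment rather than at the page being rendered.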

I am running this on a pretty clean install of CentOS 9 + php8.0 (but have tested with other versions)

I've done quite a bit to try to solve this:

I am probably missing something, but it continues to work on the CLI but not in the browser.

I think it may have something to do with Apache not being able to connect using Chromium, whereas when run from the CLI as my user it works every time.

I am really hoping someone can point something else out to try because I think I’m stumped.

Much appreciated!

pdrhlik commented 8 months ago

Hi @enricodias @T313C0mun1s7 @kuito,

I'm experiencing the same issue. I have probably figured out the reason, but I don't know how to solve it yet. My issue is that I'm calling $page->navigate($url)->waitForNavigation(); from a web request that runs on the same server as the $url I'm trying to visit. The waitForNavigation call is blocking any subsequent calls to the server from the same process. It's basically waiting in a loop, and it dies after the default timeout. If I navigate to a page that is not on my server, everything works well.

That's probably the reason why it works for most of you using the CLI and doesn't work through the web browser. They are different processes that don't block each other.

So if anyone has any ideas on how to approach this, it would be very appreciated.

Cheers, Patrik

enricodias commented 8 months ago

@pdrhlik that shouldn't be an issue, since a new request will create a separate process/thread. Your PHP process executing $page->navigate($url) is not the same one answering the request, even if $url triggers the same script.

pdrhlik commented 8 months ago

That's what I thought at first but it seems to be behaving like this. I don't have any other explanation for that. Maybe it's because I'm using php-fpm, it doesn't start a new process but reuses the same one?

Anyway, I ended up handling the screenshot functionality using a Go library https://github.com/sensepost/gowitness. I build the command in PHP and call it through shell_exec. It's non-blocking and also seems quite efficient.

But thanks for your input anyway!

enricodias commented 8 months ago

php-fpm wouldn't change that behaviour either; this is just not how PHP works. The HTTP server raises a PHP process/thread on each request, and this process doesn't receive the context of any previous requests. Maybe your script is locking a resource like a file or a DB row, causing a deadlock.
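The deadlock scenario suggested here can be illustrated with a lock that an outer request holds while the nested request (the page Chrome loads) needs the same lock. This is only a sketch of the pattern, not PHP's actual mechanism: it uses an atomic mkdir as a stand-in lock, it reports instead of blocking, and `try_lock`, `release_lock`, and the lock path are hypothetical. One real-world candidate for such a resource is PHP's default file-based session handler, which keeps the session file locked until the request ends or session_write_close() is called.

```shell
#!/bin/sh
# try_lock DIR: atomically take a lock by creating DIR; fails if already held.
try_lock() {
    mkdir "$1" 2>/dev/null
}

# release_lock DIR: give the lock back.
release_lock() {
    rmdir "$1"
}

lock="${TMPDIR:-/tmp}/render-demo.$$.lock"   # hypothetical lock path

if try_lock "$lock"; then
    echo "outer request: lock taken"
    # The nested request needs the same lock while the outer one still
    # holds it and is waiting for the nested request to finish.
    if try_lock "$lock"; then
        echo "nested request: lock taken"
    else
        echo "nested request: would wait here until the outer request finishes"
    fi
    release_lock "$lock"
fi
```

With a blocking lock, the nested request would wait for the outer one while the outer one waits for navigation to complete, so neither finishes until a timeout fires. Navigating to a page on another server never touches the shared lock, which would match the behaviour described earlier in the thread.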