bitwiseworks / qtwebengine-chromium-os2

Port of Chromium and related tools to OS/2

Network Service in OOP mode: FATAL in platform_shared_memory_region_os2.cc #42

Closed dmik closed 3 years ago

dmik commented 3 years ago

After having fixed #41, I get this in release build in multi-process mode (both browser and renderer):

[2656:1:0723/213040.252000:FATAL:platform_shared_memory_region_os2.cc(27)] Check failed: CheckPlatformHandlePermissionsCorrespondToMode(handle.get(), mode, size). 

Killed by SIGABRT
pid=0x0a60 ppid=0x0a5e tid=0x0001 slot=0x009e pri=0x0200 mc=0x0001 ps=0x0010
D:\CODING\QT5\QT5-DEV-BUILD\QTBASE\LIBEXEC\QTWEBENGINEPROCESS.EXE
Creating 0A60_01.TRP
[2654:7:0723/213042.387000:FATAL:platform_shared_memory_region_os2.cc(27)] Check failed: CheckPlatformHandlePermissionsCorrespondToMode(handle.get(), mode, size). 

Killed by SIGABRT
pid=0x0a5e ppid=0x0a5d tid=0x0007 slot=0x0094 pri=0x0200 mc=0x0001 ps=0x0010
D:\CODING\QT5\QT5-DEV-BUILD\QTWEBENGINE\EXAMPLES\WEBENGINEWIDGETS\SIMPLEBROWSER\RELEASE\SIMPLEBROWSER.EXE
Creating 0A5E_07.TRP

Trap files are rather useless in release builds because EXCEPTQ can't decode stack frames due to optimization. And of course I don't see this in debug builds.

dmik commented 3 years ago

No, I was wrong about the debug builds. They crash as well. They didn't crash before because of --enable-features=NetworkServiceInProcess in QTWEBENGINE_CHROMIUM_FLAGS, which makes the new network service run in-process rather than in a separate child process (see https://github.com/bitwiseworks/qtwebengine-chromium-os2/issues/41#issuecomment-863974769 for more info).

Adding this flag into the release build fixes the above FATAL crash there. I will rename the ticket.

dmik commented 3 years ago

I guess I know why it fails. Passing LIBCx handles between processes requires either the sender process to know the receiving process's PID or the receiver process to know the sending process's PID. Channel communication in Mojo is done so that by the time a message is about to be sent via a socket pipe to the other party, only the parent knows its child's PID; the child does NOT know the parent's PID. So, when the parent sends LIBCx handles to a child, it calls libcx_send_handles before writing them to the pipe to give the child access to them, and the child can use the handles right away once it receives them. However, when the child sends LIBCx handles to the parent, it doesn't know the parent's PID and can't call libcx_send_handles: it just writes them to the pipe. The parent, when it gets them, first needs to call libcx_take_handles to get access to them before it can use them. In fact, I had to introduce this send/take logic just because of that.

However, putting Network Service in a separate child process adds one more party that exchanges handles. Normal renderers don't talk to each other; they only talk to the main browser process. But the Network Service talks not only to the browser process but also to renderers. And since both are children, neither knows the other's PID, and hence neither actually calls libcx_send_handles or libcx_take_handles while exchanging LIBCx handles. As a result, when one side gets such a handle, it can't use it, and hence we get the FATAL message shown above.

I'm not sure what's best here. The only thing I have in mind is to additionally write the originating PID to the socket pipe along with the handles if the receiver's PID is not known (and hence libcx_send_handles can't be called). The receiving party will just extract this PID from the message and call libcx_take_handles with that PID. I will look into that.
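
For illustration, here is a minimal, portable C++ sketch of that idea. It is a toy model, not the actual Chromium/Mojo or LIBCx code: `HandleRegistry`, `Message`, `WriteMessage`, `ReadMessage` and `kUnknownPid` are all hypothetical names, and `Send`/`Take` only stand in for what libcx_send_handles and libcx_take_handles do conceptually (their real signatures are not reproduced here).

```cpp
#include <cassert>
#include <map>
#include <set>
#include <utility>
#include <vector>

using Pid = int;
using Handle = int;

// Hypothetical sentinel meaning "PID not known / not present".
constexpr Pid kUnknownPid = -1;

// Toy stand-in for per-process handle access. NOT the real LIBCx API.
class HandleRegistry {
 public:
  void Create(Pid owner, Handle h) { access_[h].insert(owner); }

  // Models libcx_send_handles: sender grants access to a KNOWN receiver.
  bool Send(Pid sender, Pid receiver, Handle h) {
    if (!CanUse(sender, h)) return false;
    access_[h].insert(receiver);
    return true;
  }

  // Models libcx_take_handles: receiver claims access from a KNOWN sender.
  bool Take(Pid receiver, Pid sender, Handle h) {
    if (!CanUse(sender, h)) return false;
    access_[h].insert(receiver);
    return true;
  }

  bool CanUse(Pid pid, Handle h) const {
    auto it = access_.find(h);
    return it != access_.end() && it->second.count(pid) > 0;
  }

 private:
  std::map<Handle, std::set<Pid>> access_;
};

// What travels over the (simulated) socket pipe.
struct Message {
  std::vector<Handle> handles;
  Pid origin_pid = kUnknownPid;  // filled in only if Send could not be called
};

// Sender side: grant access directly if the peer PID is known,
// otherwise embed our own PID so the receiver can take the handles.
Message WriteMessage(HandleRegistry& reg, Pid self, Pid peer,
                     std::vector<Handle> handles) {
  Message msg;
  msg.handles = std::move(handles);
  if (peer != kUnknownPid) {
    for (Handle h : msg.handles) reg.Send(self, peer, h);
  } else {
    msg.origin_pid = self;
  }
  return msg;
}

// Receiver side: if the message carries an originating PID, take the
// handles from that process before using them.
void ReadMessage(HandleRegistry& reg, Pid self, const Message& msg) {
  if (msg.origin_pid == kUnknownPid) return;  // access was already granted
  for (Handle h : msg.handles) reg.Take(self, msg.origin_pid, h);
}
```

In this model, a Network Service process writing to a renderer would pass `kUnknownPid` as `peer`, so the message carries its own PID and the renderer gains access in `ReadMessage`. A parent writing to a child passes the child's PID and the message carries no PID at all.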

dmik commented 3 years ago

The above idea of sending the PID over the socket pipe along with the handles worked great: Network Service as a separate process seems to work now. A commit will follow.