little-big-h / node-launchd

A Node.js Addon to get sockets from launchd upon (port-activated) launching on osx

launchd doesn't launch my app #1

Open ryandesign opened 9 years ago

ryandesign commented 9 years ago

Hello, thanks for this module; it's just what I was looking for. Unfortunately I cannot get it to work at all. I can launchctl load -w the plist (which specifies SockServiceName as 3000), but when I access http://localhost:3000 in a web browser, it says it cannot connect to the server. Activity Monitor does not show any node processes running.

I'm unsure how to debug this to figure out what, if anything, is happening. I tried setting the Debug key to true in the plist, but this only causes launchd to say "The Debug key is no longer respected. Please remove it."

I'm running node-launchd 0.0.3 in node 0.10.33 on OS X 10.10 Yosemite.

ryandesign commented 9 years ago

As a debugging tactic, I set the RunAtLoad key to true. When I then loaded the job, it elicited an immediate crash from node, which looked like this:

Process:               node [4388]
Path:                  /opt/local/bin/node
Identifier:            node
Version:               0
Code Type:             X86-64 (Native)
Parent Process:        ??? [1]
Responsible:           node [4388]
User ID:               502

Date/Time:             2014-10-26 06:49:03.238 -0500
OS Version:            Mac OS X 10.10 (14A388b)
Report Version:        11
Anonymous UUID:        A7ACF2CC-76F2-8D83-0F47-D1D925232A18

Sleep/Wake UUID:       5A2ED7EB-9063-4502-838D-5B4C484750FA

Time Awake Since Boot: 78000 seconds
Time Since Wake:       3100 seconds

Crashed Thread:        0  Dispatch queue: com.apple.main-thread

Exception Type:        EXC_BAD_INSTRUCTION (SIGILL)
Exception Codes:       0x0000000000000001, 0x0000000000000000

Application Specific Information:
XPC API Misuse: Attempt to access an out-of-bounds index.

Application Specific Signatures:
API Misuse

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libxpc.dylib                    0x00007fff8d0af850 _xpc_api_misuse + 75
1   libxpc.dylib                    0x00007fff8d0a38a9 xpc_array_get_value + 68
2   checkin.node                    0x0000000108787aa6 Method(v8::Arguments const&) + 364
3   node                            0x0000000107bc82a2 v8::internal::Builtin_HandleApiCall(v8::internal::(anonymous namespace)::BuiltinArguments<(v8::internal::BuiltinExtraArguments)1>, v8::internal::Isolate*) + 459
4   ???                             0x00003f562f70618e 0 + 69639395631502
5   ???                             0x00003f562f765d84 0 + 69639396023684
6   ???                             0x00003f562f72851e 0 + 69639395771678
7   ???                             0x00003f562f7655d0 0 + 69639396021712
8   ???                             0x00003f562f760365 0 + 69639396000613
9   ???                             0x00003f562f75c476 0 + 69639395984502
10  ???                             0x00003f562f74a1e7 0 + 69639395910119
11  ???                             0x00003f562f749bab 0 + 69639395908523
12  ???                             0x00003f562f72d139 0 + 69639395791161
13  ???                             0x00003f562f72c6c5 0 + 69639395788485
14  ???                             0x00003f562f7245e7 0 + 69639395755495
15  ???                             0x00003f562f7118b7 0 + 69639395678391
16  node                            0x0000000107beefa3 v8::internal::Invoke(bool, v8::internal::Handle<v8::internal::JSFunction>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*, bool*) + 389
17  node                            0x0000000107ba6ca7 v8::Function::Call(v8::Handle<v8::Object>, int, v8::Handle<v8::Value>*) + 403
18  node                            0x0000000107dc4041 node::Load(v8::Handle<v8::Object>) + 277
19  node                            0x0000000107dc4c1b node::Start(int, char**) + 350
20  libdyld.dylib                   0x00007fff832885c9 start + 1

ryandesign commented 9 years ago

I've traced the code in gdb, and it's in launch_data_array_get_index that xpc_array_get_value gets called, which leads to _xpc_api_misuse.

Your code assumes that the value you've stored in listening_fd_array is a LAUNCH_DATA_ARRAY, but I suspect that under 10.10 it isn't.

I note that the 10.9 version of launchd.plist(5) says:

At check-in time, the value of each Sockets dictionary key will be an array of descriptors.

But the 10.10 version of that manpage does not contain that sentence.

A further difference is that the 10.10 version says:

The job must check-in to get a copy of the file descriptors using the launch_activate_sockets(3) API.

Although the manpage says "launch_activate_sockets", the function is actually called launch_activate_socket and was apparently introduced in 10.10. Perhaps we would have more success using that function when it's available.

Alternatively, or maybe in addition, use a more defensive programming strategy and always check with launch_data_get_type what type of data you have received. See for example the find_fds function in this Apple code, which correctly handles LAUNCH_DATA_ARRAY and LAUNCH_DATA_DICTIONARY data structures no matter where they occur.
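
To illustrate the type-dispatching idea only, here is a minimal Python model. It is not a port of the Apple code and does not call the launchd API: lists stand in for LAUNCH_DATA_ARRAY, dicts for LAUNCH_DATA_DICTIONARY, and the Fd class is a hypothetical stand-in for a descriptor value. The point is that the walker checks the type at every level instead of assuming an array at a fixed position.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fd:
    """Hypothetical stand-in for a launchd file-descriptor value."""
    number: int


def find_fds(data):
    """Collect descriptor numbers from arbitrarily nested lists and dicts."""
    if isinstance(data, Fd):
        return [data.number]
    if isinstance(data, list):   # models LAUNCH_DATA_ARRAY
        return [n for item in data for n in find_fds(item)]
    if isinstance(data, dict):   # models LAUNCH_DATA_DICTIONARY
        return [n for value in data.values() for n in find_fds(value)]
    return []                    # any other type: nothing to collect, no crash


# An array of descriptors under a socket name ("Listeners" is a made-up name).
print(find_fds({"Listeners": [Fd(3), Fd(4)]}))  # [3, 4]
# No sockets at all, as when the job is started by RunAtLoad.
print(find_fds({}))                             # []
```

Because every branch checks the type first, an empty or unexpectedly shaped reply yields an empty list rather than an out-of-bounds access.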

ryandesign commented 9 years ago

I was chasing a bit of a red herring. I reimplemented the check-in using launch_activate_socket instead, and was surprised that it was returning zero sockets.

I then realized that when using RunAtLoad, launchd doesn't pass a socket, because there isn't one. That should have been obvious to me, but in the absence of any other feedback I was distracted by the crash. So there is still a bug: node-launchd crashes when there are no sockets. This should be fixed by checking for more error conditions.

I finally found the reason why launchd wasn't launching my script, which is a bug in the sample plist in your readme. It shows:

            <key>SockPassive</key>
            <string>false</string>

SockPassive is a boolean key; it accepts only true or false, not a string. I tried using boolean false, which didn't work either. With boolean true, launchd does launch my script successfully and I can connect to it in a web browser:

            <key>SockPassive</key>
            <true/>

Since true is the default for SockPassive, these two lines should just be removed from the sample plist for simplicity.

Then there's SockServiceName:

            <key>SockServiceName</key>
            <string>8080</string>

SockServiceName can be either a string, in which case it is the well-known service name, or it can be an integer, in which case it is the port number. So these lines should be changed to:

            <key>SockServiceName</key>
            <integer>8080</integer>
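
Since launchd gave no feedback about these type mistakes, a small check before loading the job might help. The following is a rough sketch using Python's plistlib; lint_sockets, the Label value, and the socket name "Listeners" are hypothetical names chosen for illustration, and the rules it applies are only the two described above.

```python
import plistlib


def lint_sockets(plist_bytes):
    """Return warnings about value types in the Sockets section of a launchd plist."""
    job = plistlib.loads(plist_bytes)
    warnings = []
    for name, spec in job.get("Sockets", {}).items():
        # A Sockets entry may be a single dictionary or an array of them.
        entries = spec if isinstance(spec, list) else [spec]
        for entry in entries:
            passive = entry.get("SockPassive")
            if passive is not None and not isinstance(passive, bool):
                warnings.append(
                    f"{name}: SockPassive should be <true/> or <false/>, "
                    f"not {type(passive).__name__}"
                )
            service = entry.get("SockServiceName")
            if isinstance(service, str) and service.isdigit():
                warnings.append(
                    f"{name}: SockServiceName {service!r} looks like a port; "
                    "use <integer>"
                )
    return warnings


# Sample with the two mistakes discussed above (Label and socket name are made up).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.example.app</string>
    <key>Sockets</key>
    <dict>
        <key>Listeners</key>
        <dict>
            <key>SockPassive</key>
            <string>false</string>
            <key>SockServiceName</key>
            <string>8080</string>
        </dict>
    </dict>
</dict>
</plist>
"""

for warning in lint_sockets(SAMPLE):
    print(warning)
```

Run against the sample, this reports one warning for each of the two keys; a plist using the boolean and integer forms produces no warnings.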

The SockType lines should be removed, because stream is the default:

            <key>SockType</key>
            <string>stream</string>

SockFamily could be removed too:

            <key>SockFamily</key>
            <string>IPv4</string>

The manpage doesn't say what the default value is, and using IPv4 works, but so does omitting the key, which is what the plists provided with OS X for system services do.
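
Putting these suggestions together, one way to avoid hand-typing the wrong value types is to generate the plist from typed data. This is only a sketch with Python's plistlib: the Label, the script path, and the socket name "Listeners" are hypothetical placeholders, and the node path is simply the one from the crash report above.

```python
import plistlib

job = {
    "Label": "com.example.app",  # hypothetical label
    # Node path taken from the crash report; the script path is a placeholder.
    "ProgramArguments": ["/opt/local/bin/node", "/path/to/app.js"],
    "Sockets": {
        "Listeners": {  # hypothetical socket name
            # A Python int is serialized as <integer>, i.e. a port number.
            "SockServiceName": 8080,
            # SockPassive, SockType and SockFamily are omitted so the
            # defaults apply.
        },
    },
}

xml = plistlib.dumps(job).decode("utf-8")
print(xml)
```

The printed XML contains `<integer>8080</integer>` for SockServiceName and none of the removed keys.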