timothyej / Shortfin

Shortfin is a very fast and lightweight open-source web server that serves static content only.
GNU General Public License v2.0

Segfault: Shortfin cannot allocate memory #3

Closed: X4 closed this issue 8 years ago

X4 commented 11 years ago
sudo systemctl start shortfin.service
/u/l/s/system ❯❯❯ sudo journalctl -xn
-- Logs begin at Di 2013-08-06 17:49:42 CEST, end at Mi 2013-08-07 17:41:15 CEST. --
Aug 07 17:41:14 SGC-Abydoss.local kill[18420]: -h, --help     display this help and exit
Aug 07 17:41:14 SGC-Abydoss.local kill[18420]: -V, --version  output version information and exit
Aug 07 17:41:14 SGC-Abydoss.local kill[18420]: For more details see kill(1).
Aug 07 17:41:14 SGC-Abydoss.local systemd[1]: shortfin.service: control process exited, code=exited status=1
Aug 07 17:41:14 SGC-Abydoss.local kernel: shortfin[18418]: segfault at 0 ip 000000000040228c sp 00007fff00aea4f8 error 6 in shortfin[400000+9000]
Aug 07 17:41:14 SGC-Abydoss.local kernel: shortfin[18419]: segfault at 0 ip 000000000040228c sp 0000000002043638 error 6 in shortfin[400000+9000]
Aug 07 17:41:14 SGC-Abydoss.local kernel: shortfin[18421]: segfault at 0 ip 000000000040228c sp 0000000002062b78 error 6 in shortfin[400000+9000]
Aug 07 17:41:14 SGC-Abydoss.local systemd[1]: Unit shortfin.service entered failed state.
Aug 07 17:41:15 SGC-Abydoss.local sudo[18426]: X4 : TTY=pts/1 ; PWD=/usr/lib64/systemd/system ; USER=root ; COMMAND=/usr/bin/journalctl -xn
Aug 07 17:41:15 SGC-Abydoss.local sudo[18426]: pam_unix(sudo:session): session opened for user root by X4(uid=0)

shortfin -c ./config/shortfin.conf
ERROR mlockall: Cannot allocate memory
 * Loading configuration file './config/shortfin.conf'...
 * Loading virtual servers...
ERROR opening config file: No such file or directory
    - 0 virtual server(s) where found.
 * Worker process #1 is started.
 * Starting keep-alive clean-up thread for worker #1.
 * Starting heartbeat thread for worker #1.
 * Starting keep-alive clean-up...
 * Starting heartbeat monitor thread.
 * The server is up and running!
 * Starting keep-alive clean-up...
^C[1]    18750 segmentation fault  shortfin -c ./config/shortfin.conf

shortfin -c /usr/local/etc/shortfin/shortfin.conf
ERROR mlockall: Cannot allocate memory
 * Loading configuration file '/usr/local/etc/shortfin/shortfin.conf'...
 * Daemonizing...

ghost commented 11 years ago

Does this problem only replicate when running as a systemd service? Can you get a core dump and look at a backtrace?

timothyej commented 11 years ago

Shortfin uses mlockall() (master_server.c:146) to lock its memory into physical RAM so that it is never swapped out. Maybe the call fails because of a resource limit (see http://sanketpadawe.blogspot.se/2012/06/prevent-page-locking-when-using.html)?
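A minimal sketch of how that limit could be inspected when the call fails. `try_lock_memory` is a hypothetical helper written for illustration, not code from master_server.c; it assumes a Linux/POSIX system and that the relevant limit is `RLIMIT_MEMLOCK`.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/resource.h>

/* Hypothetical helper (not shortfin code): try to lock all current and
 * future pages, and on failure print the RLIMIT_MEMLOCK soft limit so the
 * cause is visible. Returns 0 on success, -1 if locking was refused. */
static int try_lock_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) == 0)
        return 0;

    int saved_errno = errno;   /* keep errno before any other library call */
    struct rlimit lim;

    fprintf(stderr, "mlockall: %s\n", strerror(saved_errno));

    if (getrlimit(RLIMIT_MEMLOCK, &lim) == 0) {
        if (lim.rlim_cur == RLIM_INFINITY)
            fprintf(stderr, "RLIMIT_MEMLOCK soft limit: unlimited\n");
        else
            fprintf(stderr, "RLIMIT_MEMLOCK soft limit: %llu bytes\n",
                    (unsigned long long)lim.rlim_cur);
    }
    return -1;   /* caller may treat this as a warning and continue */
}
```

From a shell, `ulimit -l` reports the same soft limit for the current session.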

ghost commented 11 years ago

I think the mlockall() warning is a red herring. The master server continues on, and should behave fine without mlockall() succeeding, as far as I can see. @X4 I believe systemd stores core dumps in the journal by default, and you can access them with systemd-coredumpctl.

I have some reservations about shortfin using mlockall() at all, especially with MCL_FUTURE | MCL_CURRENT, which is expensive. I'd rely on the OS to handle paging properly.

X4 commented 11 years ago

It also happens when I run shortfin without systemd.

X4 commented 11 years ago

systemd-coredumpctl
No coredumps found

I don't have core dumps enabled, but I posted two different cases: in one, shortfin segfaults when I press ^C; in the other, it just keeps running in the background or foreground, depending on the daemonize flag. In all cases I can't access shortfin on localhost:88/.

./shortfin -c shortfin.conf
ERROR mlockall: Cannot allocate memory
 * Loading configuration file 'shortfin.conf'...
 * Daemonizing...

ghost commented 11 years ago

I looked into this a bit this morning. The segfault you get when shortfin is not running in daemon mode and you press ^C is caused by the signal handler using uninitialized data. As far as I can tell, the data the signal handler uses is never initialized, so that's likely the root cause in daemon mode, too.
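For comparison, here is a minimal sketch of a shutdown handler that cannot trip over uninitialized state, because the only thing it touches is a statically initialized flag. This is not shortfin's handler; the names `shutdown_flag`, `on_shutdown_signal`, `install_shutdown_handler`, and `shutdown_requested` are made up for illustration.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>

/* Statically initialized, so it is valid before any signal can arrive. */
static volatile sig_atomic_t shutdown_flag = 0;

/* The handler only records that a signal was seen; all real clean-up
 * happens later, outside signal context. */
static void on_shutdown_signal(int signo)
{
    (void)signo;
    shutdown_flag = 1;
}

/* Hypothetical helper: route SIGINT and SIGTERM to the handler above.
 * Returns 0 on success, -1 on failure. */
static int install_shutdown_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_shutdown_signal;
    sigemptyset(&sa.sa_mask);

    if (sigaction(SIGINT, &sa, NULL) != 0)
        return -1;
    return sigaction(SIGTERM, &sa, NULL);
}

/* Polled from the main loop to decide when to shut down. */
static int shutdown_requested(void)
{
    return shutdown_flag != 0;
}
```

The main loop would call `shutdown_requested()` on each iteration and free worker state itself, at a point where it knows which structures have actually been set up.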

Also, I suspect the failure to load the virtual server configuration is the root cause of shortfin not responding on the configured port. As it exists now, the configuration is loaded at two points, and I suspect that shortfin does a chdir() after the first load, which causes the second load to fail when the config path is relative.
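To spell out the theory: a relative path such as ./config/shortfin.conf is resolved against the current working directory every time it is opened, so it stops resolving once the process changes directory. A minimal sketch of one way to guard against that, assuming the path is resolved once at startup and the absolute form is reused for later loads. `resolve_config_path` is a hypothetical name, not a shortfin function.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not shortfin code): turn a possibly relative config
 * path into an absolute one while the original working directory is still
 * in effect. Returns 0 and fills `out` on success, -1 if the file does not
 * exist or the result does not fit in `outlen` bytes. */
static int resolve_config_path(const char *path, char *out, size_t outlen)
{
    /* With a NULL buffer, POSIX.1-2008 realpath() returns a malloc'd string. */
    char *resolved = realpath(path, NULL);

    if (resolved == NULL)
        return -1;

    if (strlen(resolved) >= outlen) {
        free(resolved);
        return -1;
    }

    strcpy(out, resolved);
    free(resolved);
    return 0;
}
```

Every later open of the config file would then use the absolute path stored in `out`, so a chdir() in between no longer matters.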

ghost commented 11 years ago

Oh, I forgot to mention that while debugging I also discovered that shortfin is leaking shared memory segments, which is not good because after a while shmget() fails with -ENOSPC. I'm still trying to wrap my head around the worker thread model...
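One common way to avoid leaking System V segments is to mark each one for removal right after attaching it, so the kernel frees it once the last process detaches or exits. The sketch below shows that idiom under the assumption that the segments are private to the server and its forked workers; I haven't checked how shortfin actually creates them. `alloc_shared` and `free_shared` are hypothetical names.

```c
#define _XOPEN_SOURCE 700
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical helper (not shortfin code): allocate a shared memory
 * segment that cannot outlive the processes attached to it.
 * Returns the mapped address, or NULL on failure. */
static void *alloc_shared(size_t size)
{
    int id = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (id == -1)
        return NULL;

    void *addr = shmat(id, NULL, 0);

    /* Mark the segment for destruction whether or not the attach worked.
     * An attached segment stays usable until the last detach; children
     * created with fork() inherit the attachment. */
    shmctl(id, IPC_RMID, NULL);

    return addr == (void *)-1 ? NULL : addr;
}

/* Detach the segment; the kernel destroys it after the last detach. */
static int free_shared(void *addr)
{
    return shmdt(addr);
}
```

Segments already leaked by earlier runs can be listed with `ipcs -m` and removed with `ipcrm`.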

X4 commented 11 years ago

Congrats :+1: "That was a really productive morning @joshcartwright" Glad you found the cause of it.

I'm still having fun with all kinds of web servers, like monkeyhttpd, nxweb, lwan, gwan, nginx, shortfin, lighttpd, webmachine, and many more.

ghost commented 11 years ago

One way to test out my chdir() theory would be to give shortfin the absolute path to your configuration file.