cloudius-systems / osv

OSv, a new operating system for the cloud.
osv.io
Other
4.11k stars 605 forks source link

applications started via REST API are never deleted #843

Open justinc1 opened 7 years ago

justinc1 commented 7 years ago

While trying to debug #824, I noticed this minor problem with apps started via httpserver. The app is started by osv::run(), and app object is never joined. So after about 500 cycles (the test code used to reproduce #824 - it does GET /os/version, POST /env/, PUT /app/?command=myapp.so), the server VM stopped responding (hitting enter key didn't show command prompt in console). I guess VM used most of memory for mmapping myapp.so.

This is not really a problem for me (I need to start less than 10 apps), but at least it is worth to mention it.

gdb:

(gdb) osv info thr
...
 121 (0xffff800002b9b040) zio_read_intr   cpu0 running    LZ4_uncompress_unknownOutputSize at bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/lz4.c:995 vruntime  7.28092e-07
...
189 (0xffff800002e96040) zil_clean       cpu0 waiting    synch_port::_msleep(void*, mtx*, int, char const*, int) at bsd/porting/synch.cc:104 vruntime  3.18051e-22
 196 (0xffff800002f3d040) dhcp            cpu0 waiting    dhcp::dhcp_worker::dhcp_worker_fn() at core/dhcp.cc:765 vruntime  2.42898e-19
 197 (0xffff800003929040) /libhttpserver. cpu0 waiting    waiter::wait(sched::timer*) const at include/osv/wait_record.hh:47 vruntime   1.0352e-07
 198 (0xffff800003938040) timerfd         cpu0 waiting    waiter::wait(sched::timer*) const at include/osv/wait_record.hh:47 vruntime  2.81241e-22
 199 (0xffff800003c1f040) /cli/cli.so     cpu0 waiting    waiter::wait(sched::timer*) const at include/osv/wait_record.hh:47 vruntime  1.06101e-07
 200 (0xffff800003ef0040) /http_client.so cpu0 terminated ?? at arch/x64/entry.S:113 vruntime  7.99421e+08
 201 (0xffff800003ff6040) /http_client.so cpu0 terminated ?? at arch/x64/entry.S:113 vruntime  1.43708e+09
 202 (0xffff8000040fc040) /http_client.so cpu0 terminated ?? at arch/x64/entry.S:113 vruntime  2.80413e+09
...
1930 (0xffff8000031d9040) /http_client.so cpu0 terminated ?? at arch/x64/entry.S:113 vruntime  5.22651e+08
1931 (0xffff8000032d8040) /http_client.so cpu0 terminated ?? at arch/x64/entry.S:113 vruntime  1.50457e+09
1932 (0xffff80000360d040) /http_client.so cpu0 terminated ?? at arch/x64/entry.S:113 vruntime   2.9573e+09
Number of threads: 1850

(gdb) bt
#0  0x00000000003c32be in LZ4_uncompress_unknownOutputSize (source=0xffff800003944004 "\204\177ELF\002\001\001", dest=0xffff8000735d0000 "\177ELF\002\001\001", isize=66711, maxOutputSize=131072) at bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/lz4.c:995
#1  0x00000000003c1db6 in lz4_decompress (s_start=0xffff800003944000, d_start=0xffff8000735d0000, s_len=67072, d_len=131072, n=0) at bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/lz4.c:90
#2  0x00000000003c054c in zio_decompress_data (c=ZIO_COMPRESS_LZ4, src=0xffff800003944000, dst=0xffff8000735d0000, s_len=67072, d_len=131072) at bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio_compress.c:135
#3  0x00000000003b9643 in zio_decompress (zio=0xffffa000039a4c00, data=0xffff8000735d0000, size=131072) at bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:376
#4  0x00000000003b9516 in zio_pop_transforms (zio=0xffffa000039a4c00) at bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:344
#5  0x00000000003bf31d in zio_done (zio=0xffffa000039a4c00) at bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:2905
#6  0x00000000003bb7ec in zio_execute (zio=0xffffa000039a4c00) at bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1260
#7  0x00000000003b9b98 in zio_notify_parent (pio=0xffffa000039a4c00, zio=0xffffa00072b42400, wait=ZIO_WAIT_DONE) at bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:522
#8  0x00000000003bf959 in zio_done (zio=0xffffa00072b42400) at bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3088
#9  0x00000000003bb7ec in zio_execute (zio=0xffffa00072b42400) at bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1260
#10 0x00000000002f61d0 in taskq_run_safe (arg=0xffffa00072b427b0, pending=1) at bsd/sys/cddl/compat/opensolaris/kern/opensolaris_taskq.c:147
#11 0x000000000025252e in taskqueue_run_locked (queue=0xffffa000027c6e00) at bsd/sys/kern/subr_taskqueue.c:297
#12 0x0000000000252adb in taskqueue_thread_loop (arg=0xffffa00001d3d858) at bsd/sys/kern/subr_taskqueue.c:466
#13 0x0000000000265008 in __lambda1::operator() (__closure=0xffffa00001e7c630) at bsd/porting/kthread.cc:33
#14 0x00000000002658dd in std::_Function_handler<void(), kthread_add(void (*)(void*), void*, proc*, thread**, int, int, char const*, ...)::__lambda1>::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/4.8.2/functional:2071
#15 0x000000000044add6 in std::function<void ()>::operator()() const (this=0xffff800002b9b070) at /usr/include/c++/4.8.2/functional:2471
#16 0x00000000005be2b4 in sched::thread::main (this=0xffff800002b9b040) at core/sched.cc:1207
#17 0x00000000005ba1e7 in sched::thread_main_c (t=0xffff800002b9b040) at arch/x64/arch-switch.hh:164
#18 0x0000000000489b16 in thread_main () at arch/x64/entry.S:113
(gdb) osv thr 197
(gdb) bt
#0  sched::thread::switch_to (this=0xffff800002b9b040) at arch/x64/arch-switch.hh:75
#1  0x00000000005bad36 in sched::cpu::reschedule_from_interrupt (this=0xffff8000017d1040, called_from_yield=false, preempt_after=...) at core/sched.cc:339
#2  0x00000000005ba59a in sched::cpu::schedule () at core/sched.cc:228
#3  0x00000000005be2d1 in sched::thread::wait (this=0xffff800003929040) at core/sched.cc:1213
#4  0x000000000056b092 in sched::thread::do_wait_until<sched::noninterruptible, sched::thread::dummy_lock, waiter::wait(sched::timer*) const::{lambda()#1}>(sched::thread::dummy_lock&, waiter::wait(sched::timer*) const::{lambda()#1}) (mtx=..., pred=...) at include/osv/sched.hh:1052
#5  0x000000000056ae12 in sched::thread::wait_until<waiter::wait(sched::timer*) const::{lambda()#1}>(waiter::wait(sched::timer*) const::{lambda()#1}) (pred=...) at include/osv/sched.hh:1063
#6  0x000000000056adda in waiter::wait (this=0xffff80000392cb70, tmr=0x0) at include/osv/wait_record.hh:47
#7  0x0000000000569e5b in condvar::wait (this=0xffffa00072b41730, user_mutex=0xffffa00072b41708, tmr=0x0) at core/condvar.cc:43
#8  0x000000000056a201 in condvar_wait (condvar=0xffffa00072b41730, user_mutex=0xffffa00072b41708, expiration=0) at core/condvar.cc:171
#9  0x00000000003bb872 in zio_wait (zio=0xffffa00072b41400) at bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1288
#10 0x000000000032b442 in dmu_buf_hold_array_by_dnode_sparse (dn=0xffff900002d46000, offset=749568, length=4096, read=1, tag=0xa1115f <__func__.12820>, numbufsp=0xffff80000392cd84, dbpp=0xffff80000392cd88, flags=0, fail_sparse=1) at bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:439
#11 0x000000000032b68c in dmu_buf_hold_array_sparse (os=0xffff900002a9c000, object=268, offset=749568, length=4096, read=1, tag=0xa1115f <__func__.12820>, numbufsp=0xffff80000392cd84, dbpp=0xffff80000392cd88, fail_sparce=1) at bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:489
#12 0x000000000032c80c in dmu_map_uio (os=0xffff900002a9c000, object=268, uio=0xffff80000392ce40, size=4096) at bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:1007
#13 0x00000000003a83aa in zfs_arc (vp=0xffffa00001d6b800, fp=0xffffa00001d6b600, uio=0xffff80000392ce40) at bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:753
#14 0x00000000006674f4 in vfs_file::get_arcbuf (this=0xffffa00001d6b600, key=0xffff80000392cfc0, offset=749568) at fs/vfs/vfs_fops.cc:175
#15 0x0000000000572486 in pagecache::create_read_cached_page (fp=0xffffa00001d6b600, key=...) at core/pagecache.cc:388
#16 0x0000000000572a64 in pagecache::get (fp=0xffffa00001d6b600, offset=749568, ptep=..., pte=..., write=false, shared=false) at core/pagecache.cc:470
#17 0x00000000006673f1 in vfs_file::map_page (this=0xffffa00001d6b600, off=749568, ptep=..., pte=..., write=false, shared=false) at fs/vfs/vfs_fops.cc:143
#18 0x00000000003ccbe1 in mmu::map_file_page_mmap::map (this=0xffffa00002670d20, offset=749568, ptep=..., pte=..., write=false) at core/mmu.cc:1095
#19 0x00000000003f6bbd in mmu::populate<(mmu::account_opt)1>::page<0> (this=0xffff80000392dae0, ptep=..., offset=749568) at core/mmu.cc:533
#20 0x00000000003f761e in mmu::populate_small<(mmu::account_opt)1>::page<0> (this=0xffff80000392dae0, ptep=..., offset=749568) at core/mmu.cc:551
#21 0x00000000003cb065 in mmu::page<mmu::populate_small<(mmu::account_opt)1>, 0> (pops=..., ptep=..., offset=749568) at core/mmu.cc:312
#22 0x00000000003f5a26 in mmu::map_level<mmu::populate_small<(mmu::account_opt)1>, 1>::operator() (this=0xffff80000392d420, parent=..., base_virt=17592199376896) at core/mmu.cc:446
#23 0x00000000003f4089 in mmu::map_level<mmu::populate_small<(mmu::account_opt)1>, 2>::map_range<1> (this=0xffff80000392d630, vcur=17592199376896, size=4096, page_mapper=..., slop=4096, ptep=..., base_virt=17592198627328) at core/mmu.cc:400
#24 0x00000000003f071d in mmu::map_level<mmu::populate_small<(mmu::account_opt)1>, 2>::operator() (this=0xffff80000392d630, parent=..., base_virt=17592198627328) at core/mmu.cc:450
#25 0x00000000003ecfff in mmu::map_level<mmu::populate_small<(mmu::account_opt)1>, 3>::map_range<2> (this=0xffff80000392d840, vcur=17592199376896, size=4096, page_mapper=..., slop=4096, ptep=..., base_virt=17592186044416) at core/mmu.cc:400
#26 0x00000000003e787d in mmu::map_level<mmu::populate_small<(mmu::account_opt)1>, 3>::operator() (this=0xffff80000392d840, parent=..., base_virt=17592186044416) at core/mmu.cc:450
#27 0x00000000003e2865 in mmu::map_level<mmu::populate_small<(mmu::account_opt)1>, 4>::map_range<3> (this=0xffff80000392da50, vcur=17592199376896, size=4096, page_mapper=..., slop=4096, ptep=..., base_virt=17592186044416) at core/mmu.cc:400
#28 0x00000000003dcd69 in mmu::map_level<mmu::populate_small<(mmu::account_opt)1>, 4>::operator() (this=0xffff80000392da50, parent=..., base_virt=17592186044416) at core/mmu.cc:450
#29 0x00000000003d8b1c in mmu::map_range<mmu::populate_small<(mmu::account_opt)1> > (vma_start=17592198627328, vstart=17592199376896, size=4096, page_mapper=..., slop=4096) at core/mmu.cc:355
#30 0x00000000003d4089 in mmu::operate_range<mmu::populate_small<(mmu::account_opt)1> > (mapper=..., vma_start=0x100000c00000, start=0x100000cb7000, size=4096) at core/mmu.cc:807
#31 0x00000000003d12b5 in mmu::vma::operate_range<mmu::populate_small<(mmu::account_opt)1> > (this=0xffffa00002ab7e80, mapper=..., addr=0x100000cb7000, size=4096) at core/mmu.cc:1413
#32 0x00000000003ce4b3 in mmu::populate_vma<(mmu::account_opt)1> (vma=0xffffa00002ab7e80, v=0x100000cb7000, size=4096, write=false) at core/mmu.cc:1209
#33 0x00000000003c6a1a in mmu::file_vma::fault (this=0xffffa00002ab7e80, addr=17592199376896, ef=0xffff80000392e068) at core/mmu.cc:1703
#34 0x00000000003c5777 in mmu::vm_fault (addr=17592199376896, ef=0xffff80000392e068) at core/mmu.cc:1342
#35 0x0000000000489d37 in page_fault (ef=0xffff80000392e068) at arch/x64/mmu.cc:38
#36 <signal handler called>
#37 0x0000100000c5e294 in http::server::request_parser::consume (this=this@entry=0xffff8000034c3128, req=..., input=<optimized out>) at request_parser.cc:30
#38 0x0000100000c5c18f in parse<char*> (end=0xffff8000034c10cc "", begin=0xffff8000034c1089 "ET /os/version HTTP/1.0\r\nHost: 192.168.122.90:8000\r\nAccept: */*\r\n\r\n", req=..., this=0xffff8000034c3128) at request_parser.hh:58
#39 operator() (__closure=<optimized out>, bytes_transferred=<optimized out>, ec=...) at connection.cc:255
#40 std::_Function_handler<void(boost::system::error_code, long unsigned int), http::server::connection::do_read()::__lambda3>::_M_invoke(const std::_Any_data &, boost::system::error_code, unsigned long) (__functor=..., __args#0=..., __args#1=68) at /usr/include/c++/4.8.2/functional:2071
#41 0x0000100000c83655 in operator() (__args#1=<optimized out>, __args#0=..., this=0x2000001ff400) at /usr/include/c++/4.8.2/functional:2471
#42 operator() (this=0x2000001ff400) at /usr/include/boost/asio/detail/bind_handler.hpp:118
#43 asio_handler_invoke<boost::asio::detail::binder2<std::function<void(boost::system::error_code, long unsigned int)>, boost::system::error_code, long unsigned int> > (function=...) at /usr/include/boost/asio/handler_invoke_hook.hpp:64
#44 invoke<boost::asio::detail::binder2<std::function<void(boost::system::error_code, long unsigned int)>, boost::system::error_code, long unsigned int>, std::function<void(boost::system::error_code, long unsigned int)> > (context=..., function=...) at /usr/include/boost/asio/detail/handler_invoke_helpers.hpp:39
#45 boost::asio::detail::reactive_socket_recv_op<boost::asio::mutable_buffers_1, std::function<void (boost::system::error_code, unsigned long)> >::do_complete(boost::asio::detail::task_io_service*, boost::asio::detail::task_io_service_operation*, boost::system::error_code const&, unsigned long) (owner=<optimized out>, base=0xffffa00003926600) at /usr/include/boost/asio/detail/reactive_socket_recv_op.hpp:110
#46 0x0000100000c7b38f in complete (bytes_transferred=<optimized out>, ec=..., owner=..., this=0xffffa00003926600) at /usr/include/boost/asio/detail/task_io_service_operation.hpp:37
#47 do_run_one (ec=..., this_thread=..., lock=..., this=0xffffa000033b6a00) at /usr/include/boost/asio/detail/impl/task_io_service.ipp:412
#48 boost::asio::detail::task_io_service::run (this=0xffffa000033b6a00, ec=...) at /usr/include/boost/asio/detail/impl/task_io_service.ipp:153
#49 0x0000100000c78196 in run (this=0xffffa00003573b80) at /usr/include/boost/asio/impl/io_service.ipp:59
#50 http::server::server::run (this=0xffffa00003573b80) at server.cc:100
#51 0x0000100000c7ebb4 in httpserver::global_server::run (_config=...) at global_server.cc:92
#52 0x0000100000c5122e in main (argc=<optimized out>, argv=<optimized out>) at main.cc:48
#53 0x00000000006243bc in osv::application::run_main (this=0xffffa00002d48b18, path="/libhttpserver.so", argc=1, argv=0xffffa00001e7c7a0) at core/app.cc:338
#54 0x00000000006245ed in osv::application::run_main (this=0xffffa00002d48b18) at core/app.cc:355
#55 0x00000000006240c0 in osv::application::main (this=0xffffa00002d48b18) at core/app.cc:283
#56 0x0000000000623af2 in osv::application::__lambda8::operator() (__closure=0x0, app=0xffffa00002d48b18) at core/app.cc:201
#57 0x0000000000623b17 in osv::application::__lambda8::_FUN (app=0xffffa00002d48b18) at core/app.cc:203
#58 0x000000000068a237 in pthread_private::pthread::pthread(void* (*)(void*), void*, sigset_t, pthread_private::thread_attr const*)::{lambda()#1}::operator()() const () at libc/pthread.cc:114
#59 0x000000000068cd49 in std::_Function_handler<void(), pthread_private::pthread::pthread(void* (*)(void*), void*, sigset_t, const pthread_private::thread_attr*)::__lambda6>::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/4.8.2/functional:2071
#60 0x000000000044add6 in std::function<void ()>::operator()() const (this=0xffff800003929070) at /usr/include/c++/4.8.2/functional:2471
#61 0x00000000005be2b4 in sched::thread::main (this=0xffff800003929040) at core/sched.cc:1207
#62 0x00000000005ba1e7 in sched::thread_main_c (t=0xffff800003929040) at arch/x64/arch-switch.hh:164
#63 0x0000000000489b16 in thread_main () at arch/x64/entry.S:113
(gdb) osv memory 
Total Memory: 2147081216 Bytes
Mmap Memory:  1828925440 Bytes (85.18%)
Free Memory:  212668416 Bytes (9.91%)
(gdb) 
nyh commented 7 years ago

Running the same shared object with osv::run(std::vector), as httpserver does, would not map or load the same object twice, as it would reuse the object if still in memory. So it seems the problem is, like you also suggested, with unjoined threads: In the "osv info threads" you listed about (thanks!), you can see there are almost 2000 dead threads - these threads have neither been "join()", nor marked "detached" (which would make it unnecessary to wait for them).

The problem with all these dead threads is that their stack is still alive. Our default pthread stack size is 1MB (for non-pthread threads, we have a lower default, 64 KB). So two thousand of these dead threads with 1MB stacks will indeed fill the memory.

I'll see why these threads are not being marked detached as they should. Additionally I'll open a low priority issue (just so we'll remember) that dead threads could have their stacks be deleted.

nyh commented 7 years ago

@justinc1 I'm starting to suspect that it wasn't httpserver's fault for starting (and not detaching or joining) all these threads, but rather it is your own application which starts a thread and then neither marks it detached nor joins it. Is that possible?

One of the consequences of OSv's lack of isolated processes, is that OSv cannot offer "cleanup" services when an application's main() exists... If this main() left behind un-free()ed memory, un-close()ed file descriptors, or in this case un-join()ed threads, they will be left and if the same application is repeatedly run, memory will leak until we run out of it. I don't think this is something we can (or want to) fix in OSv.

So please check if your application creates threads and forgets to detach or join them. If not please let me know and I'll continue to investigate.

justinc1 commented 7 years ago

The test app doesn't create any new threads. Simpler testcase (since I just have file already opened).

sleep.c, compiled to /sleep.so:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
  printf("PP In sleep.c main.\n");
  int repeat=5, ii;
  switch(argc) {
    case 2:
      repeat = atoi(argv[1]);
      break;
    case 1:
      break;
    default:
      printf("PP Usage: %s [repeat=5]\n", argv[0]);
      return 0;      
  }

  printf("PP cmd: %s repeat=%d.\n", argv[0], repeat);
  for(ii=0; ii<repeat; ii++) {
    sleep(1);
    fprintf(stderr,"PP   %d/%d ...\n", ii, repeat);
  }
  fprintf(stderr,"PP MAIN DONE\n");
  return 0;
}

Then 5-times

curl -X PUT      http://192.168.122.90:8000/app/?command=%2Fsleep.so%203

And gdb:

 199 (0xffff800003c1f040) /cli/cli.so     cpu0 waiting    console::LineDiscipline::read(uio*, int) at drivers/line-discipline.cc:22 vruntime       134896
 200 (0xffff800003ee7040) /sleep.so       cpu0 terminated ?? at arch/x64/entry.S:113 vruntime  1.20252e-07
 201 (0xffff800003fed040) /sleep.so       cpu0 terminated ?? at arch/x64/entry.S:113 vruntime  0.000363523
 202 (0xffff8000040f3040) /sleep.so       cpu0 terminated ?? at arch/x64/entry.S:113 vruntime   0.00359571
 203 (0xffff8000041fe040) /sleep.so       cpu0 terminated ?? at arch/x64/entry.S:113 vruntime    0.0381669
 204 (0xffff800004304040) /sleep.so       cpu0 terminated ?? at arch/x64/entry.S:113 vruntime     0.231765
nyh commented 7 years ago

Turns out that in addition to the osv::run implementation run.cc, we have a second (!?) implementation of osv::run with slightly different paramters in app.cc! Unlike the former which runs the command and waits, the second runs it in the background - and never waits for nor detaches it. Moreover, it calls application::run() which stores this application in an apps array and we're supposed to call application::join() to remove it (which we never do). So the thread never gets joined, and also the library never gets unloaded.

Strangely httpserver calls that second osv::run variant. I have to get rid of it and clean up this mes...

justinc1 commented 7 years ago

never waits for nor detaches it

For single-threaded httpserver, the app::run() should not block. So can I guess that correct fix would be that httpserver starts app main() thread as detached?

nyh commented 7 years ago

In this thread - https://groups.google.com/d/msg/osv-dev/6UggBSYElJw/CVdNLlmABgAJ - I explained what causes this problem, and proposed a patch (titled "httpserver / osv::run_background: Don't keep apps alive forever"). The patch isn't quite correct, but it's a good start.