tornadoweb / tornado

Tornado is a Python web framework and asynchronous networking library, originally developed at FriendFeed.
http://www.tornadoweb.org/
Apache License 2.0

Limit threads created at startup #2762

Closed loretoparisi closed 5 years ago

loretoparisi commented 5 years ago

When I initialize Tornado like

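import asyncio
import os
import threading
import tornado.ioloop
from tornado.platform.asyncio import AnyThreadEventLoopPolicy

class WebServer(threading.Thread):
    def run(self):
        # os.getenv returns a string when PORT is set, so normalize to int
        PORT = int(os.getenv('PORT', 8888))
        asyncio.set_event_loop_policy(AnyThreadEventLoopPolicy())
        app = application()  # application() is my own factory, defined elsewhere
        app.listen(PORT)
        tornado.ioloop.IOLoop.instance().start()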

I see a number of threads created at startup, before the first API call:

  689 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:33.89  `- python3 tornadoaas.py                                                                        
  729 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.07      `- python3 tornadoaas.py                                                                    
  730 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  731 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  732 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  733 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  734 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  735 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  736 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  737 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  738 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  739 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  741 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  742 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  743 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  744 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  745 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  746 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  747 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  748 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  749 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  750 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  751 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  752 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  753 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  754 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  755 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  756 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  757 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  758 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.00      `- python3 tornadoaas.py                                                                    
  759 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.07      `- python3 tornadoaas.py                                                                    
  760 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:22.85      `- python3 tornadoaas.py                                                                    
  761 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:23.00      `- python3 tornadoaas.py                                                                    
  762 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.02      `- python3 tornadoaas.py                                                                    
  763 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.02      `- python3 tornadoaas.py                                                                    
  764 root      20   0 9027376   2.1g 287324 S   0.0  13.1   0:00.02      `- python3 tornadoaas.py 

When the first API call is sent, all those threads report the same memory usage:

    1 root      20   0   27244  21312   7864 S   0.0   0.1   0:00.45 /usr/bin/python2 /usr/bin/supervisord                                                            
    8 root      20   0  966536 143588  36936 S   0.3   0.9   0:01.71  `- python3 dash/app.py                                                                          
  689 root      20   0 9255584   2.1g 288516 S   0.0  13.6   0:33.89  `- python3 tornadoaas.py                                                                        
  729 root      20   0 9255584   2.1g 288516 S   0.0  13.6   0:00.07      `- python3 tornadoaas.py                                                                    
  730 root      20   0 9255584   2.1g 288516 S   0.0  13.6   0:00.00      `- python3 tornadoaas.py                                                                    
  731 root      20   0 9255584   2.1g 288516 S   0.0  13.6   0:00.00      `- python3 tornadoaas.py                                                                    
  732 root      20   0 9255584   2.1g 288516 S   0.0  13.6   0:00.00      `- python3 tornadoaas.py                                                                    
  733 root      20   0 9255584   2.1g 288516 S   0.0  13.6   0:00.00      `- python3 tornadoaas.py                                                                    
  734 root      20   0 9255584   2.1g 288516 S   0.0  13.6   0:00.00      `- python3 tornadoaas.py                                                                    
  735 root      20   0 9255584   2.1g 288516 S   0.0  13.6   0:00.00      `- python3 tornadoaas.py                                                                    
  736 root      20   0 9255584   2.1g 288516 S   0.0  13.6   0:00.00      `- python3 tornadoaas.py                                                                    
  737 root      20   0 9255584   2.1g 288516 S   0.0  13.6   0:00.00      `- python3 tornadoaas.py                                                                    
  738 root      20   0 9255584   2.1g 288516 S   0.0  13.6   0:00.00      `- python3 tornadoaas.py                                                                    
  739 root      20   0 9255584   2.1g 288516 S   0.0  13.6   0:00.00      `- python3 tornadoaas.py                                                                    
  741 root      20   0 9255584   2.1g 288516 S   0.0  13.6   0:00.00      `- python3 tornadoaas.py                                                                    
  742 root      20   0 9255584   2.1g 288516 S   0.0  13.6   0:00.00      `- python3 tornadoaas.py       
...
bdarnell commented 5 years ago

The only case in which Tornado creates threads by default is for DNS resolution. In older versions of Python the limit for the number of threads created this way was pretty high, but it was reduced in Python 3.8.

You can set the limit with

 asyncio.get_event_loop().set_default_executor(concurrent.futures.ThreadPoolExecutor(max_workers=8))
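
For reference, a minimal sketch of the default pool sizes documented for concurrent.futures.ThreadPoolExecutor. The helper name default_max_workers is hypothetical; it only restates the documented formulas (cpu_count * 5 in Python 3.5 to 3.7, min(32, cpu_count + 4) from Python 3.8) and is not part of Tornado or asyncio.

```python
def default_max_workers(python_version, cpu_count):
    """Documented default ThreadPoolExecutor size (hypothetical helper).

    python_version is a tuple such as (3, 7); cpu_count is the number
    of CPUs reported by os.cpu_count().
    """
    if python_version >= (3, 8):
        # Python 3.8 and later cap the default at 32 workers
        return min(32, cpu_count + 4)
    # Python 3.5 to 3.7 scale with the CPU count and have no cap
    return cpu_count * 5


print(default_max_workers((3, 7), 8))  # 40
print(default_max_workers((3, 8), 8))  # 12
```

Note that a ThreadPoolExecutor starts its worker threads lazily as work is submitted, so max_workers is an upper bound rather than the number of threads created up front.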
loretoparisi commented 5 years ago

@bdarnell thanks a lot. So this is my code (it may help someone else in the future):

class WebServer(threading.Thread):
    def run(self):
        PORT = os.getenv('PORT', 8888)
        N_CPU = multiprocessing.cpu_count()
        asyncio.set_event_loop_policy(AnyThreadEventLoopPolicy())
        asyncio.get_event_loop().set_default_executor(concurrent.futures.ThreadPoolExecutor(max_workers=N_CPU))
        app = application()
        app.listen(PORT)
        tornado.ioloop.IOLoop.instance().start()
WebServer().start()

NOTE: The line asyncio.set_event_loop_policy(AnyThreadEventLoopPolicy()) is necessary; otherwise the IOLoop will act differently on different operating systems (such as Debian, Ubuntu, or macOS/Windows).

loretoparisi commented 5 years ago

@bdarnell I have just noticed that even after setting asyncio.get_event_loop().set_default_executor(ThreadPoolExecutor(max_workers=N_CPU)) with N_CPU = 8, I still get

top - 15:17:27 up  5:29,  0 users,  load average: 0.32, 1.21, 1.20
Threads:  60 total,   1 running,  59 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  16024.9 total,   3686.5 free,   2239.0 used,  10099.4 buff/cache
MiB Swap:   4096.0 total,   4096.0 free,      0.0 used.  13454.8 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                                                          
    1 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:52.51 python3 tornadoaas.py                                                                            
   66 root      20   0    7260   2056   1844 S   0.0   0.0   0:00.00  `- /usr/sbin/cron                                                                               
   84 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.07  `- python3 tornadoaas.py                                                                        
   85 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
   86 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
   87 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
   88 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
   89 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
   90 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
   91 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
   92 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
   93 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
   94 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
   96 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
   97 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
   98 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
   99 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
  100 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
  101 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
  102 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
  103 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
  104 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
  105 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
  106 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
  107 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
  108 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
  109 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.01  `- python3 tornadoaas.py                                                                        
  110 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
  111 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
  112 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
  113 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
  114 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.00  `- python3 tornadoaas.py                                                                        
  115 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:26.67  `- python3 tornadoaas.py                                                                        
  116 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:26.13  `- python3 tornadoaas.py                                                                        
  124 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.11  `- python3 tornadoaas.py                                                                        
  125 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.11  `- python3 tornadoaas.py                                                                        
  126 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.08  `- python3 tornadoaas.py                                                                        
  127 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.09  `- python3 tornadoaas.py                                                                        
  128 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.12  `- python3 tornadoaas.py                                                                        
  129 root      20   0 9031000   2.1g 294712 S   0.0  13.2   0:00.11  `- python3 tornadoaas.py
... 
ploxiln commented 5 years ago

If you are using AnyThreadEventLoopPolicy(), it is plausible that you are creating multiple IOLoops on multiple threads. It is also plausible that one of those IOLoops, running on a different thread from the one that called asyncio.get_event_loop().set_default_executor(), is doing lots of concurrent DNS lookups.

I suggest being very explicit and deliberate about which threads are running event loops, instead of using AnyThreadEventLoopPolicy(), to debug and fix where you are unexpectedly using different threads.
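
One way to be explicit is sketched below, using only asyncio so that it runs on its own. The class name LoopThread and its attributes are hypothetical, and this is an illustration under those assumptions rather than the only way to structure it. The thread creates its own loop, installs it, and sets the default executor on that exact loop, so there is no doubt about which loop the limit applies to.

```python
import asyncio
import concurrent.futures
import threading


class LoopThread(threading.Thread):
    """Hypothetical thread that owns exactly one event loop."""

    def __init__(self, max_workers):
        super().__init__(daemon=True)
        self.max_workers = max_workers
        self.loop = None
        self.ready = threading.Event()

    def run(self):
        # Create and install the loop explicitly in this thread,
        # instead of relying on a policy to create one on demand.
        self.loop = asyncio.new_event_loop()
        asyncio.set_event_loop(self.loop)
        # The executor limit is set on the loop that will actually run.
        self.loop.set_default_executor(
            concurrent.futures.ThreadPoolExecutor(max_workers=self.max_workers)
        )
        self.ready.set()
        # A Tornado application would call app.listen(port) here.
        try:
            self.loop.run_forever()
        finally:
            self.loop.close()

    def stop(self):
        # loop.stop() must be scheduled from outside the loop's thread.
        self.loop.call_soon_threadsafe(self.loop.stop)


if __name__ == "__main__":
    server = LoopThread(max_workers=8)
    server.start()
    server.ready.wait()
    server.stop()
    server.join()
```

With this layout, any other thread that needs an event loop has to create one deliberately, which makes unexpected loops easier to spot.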

loretoparisi commented 5 years ago

@ploxiln good point, thank you. I will go through the non-default executors in the handlers then. I can see all these threads being allocated at startup, i.e. just after tornado.ioloop.IOLoop.instance().start(), so before any call to the handlers. I have a base handler for CPU-intensive tasks like

class BaseHandler(web.RequestHandler):
    @tornado.gen.coroutine
    def get_async(self, *args):
        resp_dict = yield ioloop.IOLoop.current().run_in_executor(executor, self.get_handler, args)
        return resp_dict

    @tornado.gen.coroutine
    def get(self, *args):
        response = yield self.get_async(args)
        self.set_header("Content-Type", "application/json; charset=UTF-8")
        status_code = response.getField('status_code', 200)
        self.set_status(status_code)
        self.write(json.dumps(response.getData()))

from which some handlers inherit.
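
To check how many threads an explicit executor like the one passed to run_in_executor above really creates, a small standalone sketch can help. It uses plain asyncio instead of Tornado so that it runs on its own; MAX_WORKERS, blocking_work, and the "handler-pool" thread name prefix are assumptions made for illustration and are not taken from my application.

```python
import asyncio
import concurrent.futures
import threading
import time

MAX_WORKERS = 4  # assumed cap, chosen only for this illustration


def blocking_work(n):
    """Stand-in for a CPU-intensive or blocking handler body."""
    time.sleep(0.05)
    return n * n


async def run_jobs(executor, count):
    # Same pattern as the handler: hand blocking work to an explicit executor.
    loop = asyncio.get_running_loop()
    jobs = [loop.run_in_executor(executor, blocking_work, n) for n in range(count)]
    return await asyncio.gather(*jobs)


def pool_thread_names(prefix):
    # Naming the pool's threads makes them easy to find among all threads.
    return sorted(t.name for t in threading.enumerate() if t.name.startswith(prefix))


def demo():
    executor = concurrent.futures.ThreadPoolExecutor(
        max_workers=MAX_WORKERS, thread_name_prefix="handler-pool"
    )
    try:
        results = asyncio.run(run_jobs(executor, 20))
        names = pool_thread_names("handler-pool")
    finally:
        executor.shutdown(wait=True)
    return results, names


if __name__ == "__main__":
    results, names = demo()
    print(len(results), "jobs ran on", len(names), "pool threads")
```

The number of pool threads never exceeds MAX_WORKERS, however many jobs are submitted. threading.enumerate() only lists threads known to Python's threading module, so threads started internally by native libraries would not appear in this list even though top shows them.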