pallets / werkzeug

The comprehensive WSGI web application library.
https://werkzeug.palletsprojects.com
BSD 3-Clause "New" or "Revised" License
6.65k stars 1.73k forks source link

Performance regression starting with werkzeug 2.3.0 #2925

Closed omikader closed 3 months ago

omikader commented 3 months ago

I am in the process of upgrading a Flask app and noticed a performance regression on the Werkzeug development server starting with version 2.3.0. To confirm, I created a simple app based on the OpenTelemetry guide and ran it as a Docker container.

from random import randint
from flask import Flask, request
from werkzeug.middleware.profiler import ProfilerMiddleware
import logging

app = Flask(__name__)
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

@app.route("/rolldice")
def roll_dice():
    player = request.args.get('player', default=None, type=str)
    result = str(roll())
    if player:
        logger.warning("%s is rolling the dice: %s", player, result)
    else:
        logger.warning("Anonymous player is rolling the dice: %s", result)
    return result

def roll():
    return randint(1, 6)

if __name__ == "__main__":
    app.wsgi_app = ProfilerMiddleware(app.wsgi_app, restrictions=[30])
    app.run(host="0.0.0.0", port=8100)

I used Apache Benchmark to simulate requests on the same application running versions 2.2.3 and 2.3.0 of werkzeug, respectively. The former can handle 479.17 requests per second and the latter only handles 53.59.

% ab -n 100 http://127.0.0.1:8100/rolldice
This is ApacheBench, Version 2.3 <$Revision: 1903618 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient).....done

Server Software:        Werkzeug/2.2.3
Server Hostname:        127.0.0.1
Server Port:            8100

Document Path:          /rolldice
Document Length:        1 bytes

Concurrency Level:      1
Time taken for tests:   0.209 seconds
Complete requests:      100
Failed requests:        0
Total transferred:      17300 bytes
HTML transferred:       100 bytes
Requests per second:    479.17 [#/sec] (mean)
Time per request:       2.087 [ms] (mean)
Time per request:       2.087 [ms] (mean, across all concurrent requests)
Transfer rate:          80.95 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       0
Processing:     1    2   1.4      2      13
Waiting:        1    2   1.4      2      13
Total:          1    2   1.4      2      14

Percentage of the requests served within a certain time (ms)
  50%      2
  66%      2
  75%      2
  80%      2
  90%      3
  95%      5
  98%      6
  99%     14
 100%     14 (longest request)
% ab -n 100 http://127.0.0.1:8100/rolldice
This is ApacheBench, Version 2.3 <$Revision: 1903618 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient).....done

Server Software:        Werkzeug/2.3.0
Server Hostname:        127.0.0.1
Server Port:            8100

Document Path:          /rolldice
Document Length:        1 bytes

Concurrency Level:      1
Time taken for tests:   1.866 seconds
Complete requests:      100
Failed requests:        0
Total transferred:      17300 bytes
HTML transferred:       100 bytes
Requests per second:    53.59 [#/sec] (mean)
Time per request:       18.660 [ms] (mean)
Time per request:       18.660 [ms] (mean, across all concurrent requests)
Transfer rate:          9.05 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       1
Processing:    15   18   1.9     18      32
Waiting:        4    7   1.8      7      22
Total:         15   19   2.0     18      32

Percentage of the requests served within a certain time (ms)
  50%     18
  66%     19
  75%     19
  80%     20
  90%     20
  95%     21
  98%     22
  99%     32
 100%     32 (longest request)

I added the the ProfilerMiddleware but didn't notice anything out of the ordinary.

On 2.2.3

PATH: '/rolldice'
         466 function calls (461 primitive calls) in 0.000 seconds

   Ordered by: internal time, call count
   List reduced from 202 to 30 due to restriction <30>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        2    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/local.py:147(pop)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/flask/ctx.py:225(pop)
       65    0.000    0.000    0.000    0.000 {built-in method builtins.isinstance}
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/routing/map.py:246(bind_to_environ)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/logging/__init__.py:288(__init__)
        1    0.000    0.000    0.000    0.000 {method 'flush' of '_io.TextIOWrapper' objects}
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/flask/ctx.py:394(pop)
        9    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/local.py:160(top)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/wrappers/request.py:106(__init__)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/middleware/profiler.py:102(runapp)
        7    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/typing.py:255(inner)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/routing/matcher.py:67(match)
        2    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/datastructures.py:1134(set)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/flask/app.py:2021(wsgi_app)
        5    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/datastructures.py:21(iter_multi_items)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/flask/ctx.py:354(push)
        4    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/flask/app.py:2071(_request_blueprints)
        1    0.000    0.000    0.000    0.000 ./app.py:11(roll_dice)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/wrappers/response.py:478(get_wsgi_headers)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/sansio/response.py:110(__init__)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/sansio/response.py:165(_clean_status)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/wsgi.py:476(__init__)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/flask/app.py:1616(make_response)
        8    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/datastructures.py:840(_unicodify_header_value)
        7    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/_internal.py:130(_to_str)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/routing/map.py:488(match)
      8/7    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/typing.py:720(__hash__)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/logging/__init__.py:1514(findCaller)
        2    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/local.py:140(push)
      4/1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/routing/matcher.py:77(_match)

On 2.3.0

PATH: '/rolldice'
         422 function calls (418 primitive calls) in 0.001 seconds

   Ordered by: internal time, call count
   List reduced from 200 to 30 due to restriction <30>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 {method 'flush' of '_io.TextIOWrapper' objects}
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/routing/map.py:270(bind_to_environ)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/logging/__init__.py:288(__init__)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/wrappers/request.py:109(__init__)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/sansio/response.py:133(__init__)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/sansio/request.py:201(__init__)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/flask/ctx.py:354(push)
        1    0.000    0.000    0.000    0.000 ./app.py:11(roll_dice)
        1    0.000    0.000    0.001    0.001 /usr/local/lib/python3.8/site-packages/werkzeug/middleware/profiler.py:104(runapp)
      4/1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/routing/matcher.py:79(_match)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/routing/matcher.py:69(match)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/flask/ctx.py:394(pop)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/wsgi.py:258(__init__)
        1    0.000    0.000    0.001    0.001 /usr/local/lib/python3.8/site-packages/flask/app.py:2021(wsgi_app)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/routing/map.py:513(match)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/flask/ctx.py:278(__init__)
        2    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/flask/app.py:1734(create_url_adapter)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/flask/app.py:1616(make_response)
      9/8    0.000    0.000    0.000    0.000 {method 'encode' of 'str' objects}
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/logging/__init__.py:1514(findCaller)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/wrappers/response.py:143(__init__)
        9    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/local.py:158(top)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/sansio/request.py:276(args)
        2    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/datastructures/headers.py:312(set)
       49    0.000    0.000    0.000    0.000 {built-in method builtins.isinstance}
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/inspect.py:171(_has_code_flag)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/wrappers/response.py:439(get_wsgi_headers)
        5    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/datastructures/structures.py:18(iter_multi_items)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/typing.py:802(__getitem__)
        4    0.000    0.000    0.000    0.000 /usr/local/lib/python3.8/site-packages/werkzeug/datastructures/headers.py:499(_str_header_value)

I am aware that the development server provided by Flask is not intended for production use, so benchmarking it is perhaps a contrived exercise -- but I was surprised by these results and I couldn't find anything in the changelog or elsewhere online that explains the difference, hence the issue.

Thank you for the awesome project!

Environment:

davidism commented 3 months ago

Werkzeug 2.3.0 is no longer supported. Try 3.0.x. You're also using Python 3.8, but 3.12 has significant speedups compared to that. And the dev server is not designed to be consistently performant. That said, there's not really anything we can do about this. If you want to use a profiler and figure out optimizations for a PR, that's fine, but it's too broad to just leave open otherwise.

omikader commented 3 months ago

Hi @davidism, thanks for the quick response! I had first noticed the issue when I upgraded to Python 3.12 and Flask/Werkzeug 3.0.3 so I spent a little time trying to isolate the change and landed on Werkzeug 2.2.3 -> 2.3.0. Here are the ab and profiler results for the latest stable versions of Python + Flask, respectively.

% ab -n 100 http://127.0.0.1:8100/rolldice
This is ApacheBench, Version 2.3 <$Revision: 1903618 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient).....done

Server Software:        Werkzeug/3.0.3
Server Hostname:        127.0.0.1
Server Port:            8100

Document Path:          /rolldice
Document Length:        1 bytes

Concurrency Level:      1
Time taken for tests:   1.812 seconds
Complete requests:      100
Failed requests:        0
Total transferred:      17300 bytes
HTML transferred:       100 bytes
Requests per second:    55.20 [#/sec] (mean)
Time per request:       18.117 [ms] (mean)
Time per request:       18.117 [ms] (mean, across all concurrent requests)
Transfer rate:          9.33 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       1
Processing:    15   18   1.8     17      30
Waiting:        4    6   1.8      6      19
Total:         15   18   1.8     18      31

Percentage of the requests served within a certain time (ms)
  50%     18
  66%     18
  75%     19
  80%     19
  90%     19
  95%     20
  98%     25
  99%     31
 100%     31 (longest request)
PATH: '/rolldice'
         404 function calls (400 primitive calls) in 0.001 seconds

   Ordered by: internal time, call count
   List reduced from 194 to 30 due to restriction <30>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {method 'write' of '_io.TextIOWrapper' objects}
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.12/site-packages/werkzeug/routing/map.py:252(bind_to_environ)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.12/logging/__init__.py:298(__init__)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.12/site-packages/werkzeug/sansio/request.py:159(args)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.12/site-packages/werkzeug/wrappers/request.py:110(__init__)
      4/1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.12/site-packages/werkzeug/routing/matcher.py:79(_match)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.12/site-packages/werkzeug/sansio/response.py:111(__init__)
        1    0.000    0.000    0.001    0.001 /usr/local/lib/python3.12/site-packages/werkzeug/middleware/profiler.py:114(runapp)
        2    0.000    0.000    0.000    0.000 /usr/local/lib/python3.12/site-packages/flask/app.py:401(create_url_adapter)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.12/site-packages/flask/app.py:1233(preprocess_request)
      9/8    0.000    0.000    0.000    0.000 {method 'encode' of 'str' objects}
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.12/site-packages/flask/ctx.py:367(push)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.12/site-packages/flask/ctx.py:396(pop)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.12/site-packages/werkzeug/routing/matcher.py:69(match)
        1    0.000    0.000    0.000    0.000 /backend/./app.py:11(roll_dice)
        1    0.000    0.000    0.001    0.001 /usr/local/lib/python3.12/site-packages/flask/app.py:1441(wsgi_app)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.12/site-packages/flask/app.py:1260(process_response)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.12/site-packages/flask/ctx.py:309(__init__)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.12/site-packages/werkzeug/wsgi.py:233(__init__)
       41    0.000    0.000    0.000    0.000 {built-in method builtins.isinstance}
        5    0.000    0.000    0.000    0.000 /usr/local/lib/python3.12/site-packages/werkzeug/local.py:310(__get__)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.12/site-packages/flask/app.py:1092(make_response)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.12/random.py:242(_randbelow_with_getrandbits)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.12/logging/__init__.py:1541(warning)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.12/site-packages/flask/ctx.py:251(push)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.12/site-packages/werkzeug/sansio/request.py:118(__init__)
        4    0.000    0.000    0.000    0.000 /usr/local/lib/python3.12/site-packages/werkzeug/datastructures/headers.py:457(_str_header_value)
        5    0.000    0.000    0.000    0.000 /usr/local/lib/python3.12/site-packages/werkzeug/_internal.py:29(_wsgi_decoding_dance)
        1    0.000    0.000    0.000    0.000 /usr/local/lib/python3.12/site-packages/flask/app.py:867(full_dispatch_request)

Given that the development server is not meant for production, I agree that it probably makes sense to close.

davidism commented 3 months ago

That doesn't match our benchmarks either, as there were significant speed increases to the code in Werkzeug 3, with the removal of bytes/str dual handling and the old urllib copy.

omikader commented 3 months ago

Thanks for following up @davidism! If it's not too much trouble, I'd love to take a look at your benchmarks so I can try to reproduce but no worries if not. Have a great week!