Azure / azure-functions-python-library

Azure Functions Python SDK
MIT License

[WIP] Add ujson as alternative JSON encoder #130

Open tonybaloney opened 2 years ago

tonybaloney commented 2 years ago

The standard library json module is the slowest of the json encoders.

ujson is 10-20x faster at encoding and decoding, especially for large datasets.

This PR moves the json imports into a shim module, which picks the standard library implementation or ujson depending on whether:

vrdmr commented 2 years ago

Any specific reason to choose ujson over orjson?

tonybaloney commented 2 years ago

Any specific reason to choose ujson over orjson?

Supporting StringifyEnum was impossible without a fork of orjson. I tried one, but it was using old bindings for Python.

ujson supports custom type serialisation via a __json__ method on the class, which is going to be more performant. It's also more compatible with json.

codecov[bot] commented 2 years ago

Codecov Report

Merging #130 (c090330) into dev (284c15d) will decrease coverage by 0.25%. The diff coverage is 81.81%.

@@            Coverage Diff             @@
##              dev     #130      +/-   ##
==========================================
- Coverage   86.04%   85.79%   -0.26%     
==========================================
  Files          50       51       +1     
  Lines        2903     2922      +19     
  Branches      391      396       +5     
==========================================
+ Hits         2498     2507       +9     
- Misses        329      336       +7     
- Partials       76       79       +3     
Flag        Coverage Δ
unittests   85.79% <81.81%> (-0.22%)

Flags with carried forward coverage won't be shown.

Impacted Files                          Coverage Δ
azure/functions/_durable_functions.py   68.29% <ø> (ø)
azure/functions/_json.py                72.97% <72.97%> (ø)
azure/functions/_cosmosdb.py            88.88% <100.00%> (ø)
azure/functions/_http.py                91.30% <100.00%> (ø)
azure/functions/_queue.py               84.61% <100.00%> (ø)
azure/functions/_sql.py                 100.00% <100.00%> (ø)
azure/functions/cosmosdb.py             74.35% <100.00%> (ø)
azure/functions/decorators/utils.py     100.00% <100.00%> (+2.53%)
azure/functions/durable_functions.py    83.33% <100.00%> (ø)
azure/functions/eventgrid.py            90.90% <100.00%> (ø)
... and 11 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 284c15d...c090330.

tonybaloney commented 2 years ago

[Image: benchmark]

This is the benchmark between ujson (left) and json (right) for HttpRequest.get_json()
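A rough local comparison along the same lines could be sketched as follows. This is an assumption-laden illustration, not the benchmark from the image: it times only the decode step rather than HttpRequest.get_json() itself, the payload is made up, and time_loads is a hypothetical helper. ujson is treated as optional, so only the standard library is timed when it is not installed.

```python
import json
import timeit

try:
    import ujson  # optional third-party dependency
except ImportError:
    ujson = None

# Made-up payload: a small object with a list of 100 nested items.
PAYLOAD = json.dumps({
    "id": "0001",
    "type": "donut",
    "batters": {
        "batter": [{"id": str(1000 + i), "type": "Regular"} for i in range(100)]
    },
})


def time_loads(loads, number=1000):
    """Return total seconds taken to decode PAYLOAD `number` times."""
    return timeit.timeit(lambda: loads(PAYLOAD), number=number)


results = {"json": time_loads(json.loads)}
if ujson is not None:
    results["ujson"] = time_loads(ujson.loads)

for name, seconds in results.items():
    print(f"{name}: {seconds:.4f}s for 1000 decodes")
```

Timings from a loop like this vary with machine and payload shape, so they are only a sanity check before a deployed test.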

tonybaloney commented 2 years ago

I've deployed two Azure Functions in Australia East, one with this patch applied and one without.

The sample POST request is:

{
    "id": "0001",
    "type": "donut",
    "name": "Cake",
    "ppu": 0.55,
    "batters":
        {
            "batter":
                [
                    { "id": "1001", "type": "Regular" },
                    { "id": "1002", "type": "Chocolate" },
                    { "id": "1003", "type": "Blueberry" },
                    { "id": "1004", "type": "Devil's Food" }
                ]
        },
    "topping":
        [
            { "id": "5001", "type": "None" },
            { "id": "5002", "type": "Glazed" },
            { "id": "5005", "type": "Sugar" },
            { "id": "5007", "type": "Powdered Sugar" },
            { "id": "5006", "type": "Chocolate with Sprinkles" },
            { "id": "5003", "type": "Chocolate" },
            { "id": "5004", "type": "Maple" }
        ]
}

The function source code is:

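import json

import azure.functions as func


def main(req: func.HttpRequest) -> func.HttpResponse:
    # get_json() raises ValueError when the body is not valid JSON.
    try:
        req_body = req.get_json()
    except ValueError:
        # Return early so req_body is never used while undefined.
        return func.HttpResponse("Invalid JSON body", status_code=400)

    # Echo the parsed body back, so each request does one decode and one encode.
    return func.HttpResponse(
        json.dumps(req_body),
        status_code=200
    )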

The script to test the two deployments:

$ ab -p test_data.json -T application/json -n 1000 -c 10 https://ant-functions-load-testing.azurewebsites.net/api/httptriggertest
$ ab -p test_data.json -T application/json -n 1000 -c 10 https://ant-functions-load-testing-og.azurewebsites.net/api/httptriggertest

The results are (response time in ms at each percentile):

Percentile               50    66    75    80    90    95    98    99
JSON (ms)               114   119   125   128   150   217   345  2113
UJSON (ms)              111   116   118   121   126   131   145   175
Normalised JSON (ms)     44    49    55    58    80   147   275  2043
Normalised UJSON (ms)    41    46    48    51    56    61    75   105

I've subtracted 70ms as this was the mean connect time, so you can more clearly see the difference between the two branches.

On the normalised figures, ujson is about 7% faster at the 50th percentile (41 ms vs 44 ms), but importantly about 2.4x faster at the 95th percentile (61 ms vs 147 ms). Ignore the 99th percentile, as it will include cold-start times.
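The normalisation and the ratios can be recomputed from the table above with a few lines of Python. The figures are copied from the results; normalise and speedups are hypothetical helper names used only for this sketch.

```python
PERCENTILES = [50, 66, 75, 80, 90, 95, 98, 99]
JSON_MS = [114, 119, 125, 128, 150, 217, 345, 2113]
UJSON_MS = [111, 116, 118, 121, 126, 131, 145, 175]
CONNECT_MS = 70  # mean connect time subtracted from every sample


def normalise(samples, offset=CONNECT_MS):
    """Subtract the fixed connect time from each percentile value."""
    return [s - offset for s in samples]


def speedups(baseline, candidate):
    """Ratio of baseline to candidate time at each percentile."""
    return [b / c for b, c in zip(baseline, candidate)]


norm_json = normalise(JSON_MS)
norm_ujson = normalise(UJSON_MS)
ratios = dict(zip(PERCENTILES, speedups(norm_json, norm_ujson)))

for p in PERCENTILES:
    print(f"p{p}: json is {ratios[p]:.2f}x the ujson time")
```

Because the 70 ms offset is subtracted from both sides, the normalised ratios are larger than the ratios of the raw timings.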
