dpkp / kafka-python

Python client for Apache Kafka
http://kafka-python.readthedocs.io/
Apache License 2.0

Upgrade dependency on crc32c to non-optional? #2380

Open rtobar opened 1 year ago

rtobar commented 1 year ago

Hi, thanks for reviving this project, good to see it's getting some renewed maintenance.

Lately our team has been using this package and aiokafka, and while doing some profiling I found they both cater for fast crc32c implementations (optional dependency on crc32c in this package, Cython extension in aiokafka), while also providing Python fallbacks. Depending on various factors one can end up going through these fallbacks, which hurts performance. We experienced this already, and it would have gone unnoticed had we not been doing this profiling.

This is more of a question than an issue (the Discussions section isn't enabled, otherwise I would've posted there): what would be your thoughts on declaring crc32c as a non-optional dependency? It's a small package with no dependencies of its own, and it provides binary wheels via cibuildwheel for all major platforms/archs, so in principle it shouldn't add issues at installation time (and even if compilation is required, only a C compiler is needed). Having it always installed would mean that the chances of getting a slow Python-based crc32c impl would be null, and even when no hardware acceleration can be found, the C-based implementation should still beat the Python one (I even assume that would be true when running under PyPy, since the hot loop of the checksum doesn't interact with the Python C API).
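To make the fallback pattern concrete, here is a minimal sketch of how a client can prefer the C-backed `crc32c` package and silently drop to a pure-Python loop when it is missing. This is an illustration only, not the actual code from kafka-python or aiokafka; the names `crc32c_py` and `checksum` are made up for this example.

```python
# Illustrative sketch only: NOT the kafka-python or aiokafka source.
# `crc32c_py` and `checksum` are hypothetical names for this example.

_POLY = 0x82F63B78  # CRC-32C (Castagnoli) polynomial, reflected form


def _make_table():
    """Precompute the 256-entry lookup table for byte-at-a-time CRC-32C."""
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ _POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def crc32c_py(data, crc=0):
    """Pure-Python CRC-32C: one table lookup per input byte."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


try:
    # Optional C extension; may be absent (or fail to import) on some setups.
    from crc32c import crc32c as _fast_crc32c
except ImportError:
    _fast_crc32c = None


def checksum(data):
    """Use the C implementation when available, else the slow fallback."""
    if _fast_crc32c is not None:
        return _fast_crc32c(data)
    return crc32c_py(data)
```

The point of the sketch is that both branches return the same value, so nothing tells the user which one is running; only a profiler shows the per-byte Python loop.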

Just looking for quick feedback at the moment. If the idea is well received I'd be happy to put together a pull request for this.

Full disclosure: I am actually the maintainer of crc32c too, which is why I took a personal interest in this when I saw it during our profiling. However, I didn't know that Kafka used crc32c in general, nor that this package depended optionally on crc32c, so seeing this come up was a complete surprise. I also plan to have a similar discussion under aiokafka.

dpkp commented 1 year ago

Historically the kafka-python project philosophy has been to be pure-python and not have any required dependencies. I recognize that the python packaging ecosystem has improved substantially in the past several years, and w/ wheels generally available perhaps we should reconsider. But this is a pretty big change from our current philosophy so not something we should take on lightly.

Right now crc32c is in the "extra_requires" bucket and would be installed for any user that selects extras. I might be more inclined to create a more aggressive set of 'optimizations' extra_requires or improve the documentation on these packages.

rtobar commented 1 year ago

Thanks @dpkp for your swift answer!

[...] project philosophy has been to be pure-python and not have any required dependencies

Although I hadn't read this explicitly, I was guessing this was current reasoning, thanks for confirming.

Also, to make the point explicit: adding a dependency on a C extension module only breaks the rule of kafka-python having no dependencies, not the one about it being written purely in Python. After adding the dependency, kafka-python itself would still be written purely in Python. It would use a C-written module, sure, but in just the same way that it already uses built-in C-written modules and objects provided by the CPython interpreter.

this is a pretty big change from our current philosophy so not something we should take on lightly.

Agreed. This would be a leap from zero to non-zero dependencies, so I understand it would have to be considered carefully. I wanted to get the ball rolling on having this discussion though, since it would potentially be an easy performance win for many users that might not be aware of the availability of optional requirements.

Right now crc32c is in the "extra_requires" bucket and would be installed for any user that selects extras. I might be more inclined to create a more aggressive set of 'optimizations' extra_requires or improve the documentation on these packages.

This would also be a good first step. Again, I understand this aligns better with the current package philosophy, and is a less risky move. If, for example, users see clearly in the readme/documentation that you strongly recommend the installation of these extra packages for performance, that'd be a good result too.
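As a complement to documentation, users can check for themselves whether the optional module is importable in their environment. The following is a small sketch using only the standard library; the helper names `missing_optional_modules` and `warn_if_slow_path` are hypothetical and not part of kafka-python.

```python
# Hypothetical helper, not part of kafka-python: reports which optional
# speed-up modules cannot be found in the current environment.
import importlib.util
import warnings


def missing_optional_modules(names=("crc32c",)):
    """Return the subset of top-level module names that are not importable."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def warn_if_slow_path(names=("crc32c",)):
    """Emit a warning when an optional accelerator module is missing."""
    missing = missing_optional_modules(names)
    if missing:
        warnings.warn(
            "Optional module(s) not installed: "
            + ", ".join(missing)
            + ". A slower pure-Python fallback may be used "
            "(for crc32c, try: pip install crc32c).",
            RuntimeWarning,
        )
    return missing
```

Calling `warn_if_slow_path()` once at application start-up would surface the situation described in this thread without changing any dependency declarations.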

Finally, and for reference, some numbers: kafka-python has ~250k daily downloads, while crc32c has ~20k (and those are probably not all coming from kafka-python users), which tells you that there are many kafka-python users who are leaving some performance on the table.

wbarnha commented 1 year ago

I think retaining the current crc32c implementation is wise, but if users install the other crc32c package, it can defer to using that library automatically.

Edit: Apparently we already do that according to https://kafka-python.readthedocs.io/en/master/install.html but I need to review the code to make sure that is still true.

rtobar commented 1 year ago

Apparently we already do that

Indeed. The discussion is not about adding the dependency as an optional one (it already is) but about making it non-optional. I understand that this is not trivial, but OTOH there's potential for performance benefits for the vast majority of users who are not installing this optional dependency.

KeatonWakefield01 commented 1 year ago

Quick question regarding this conversation: is there already a resource that shows the difference in performance?

rtobar commented 1 year ago

We have a benchmark against aiokafka==0.8.1 for various message sizes and counts. I just ran it with aiokafka using their Cython-based crc32c implementation (not quite as good as using the crc32c package, since it supports fewer architectures and has fewer acceleration options, but good enough for comparison purposes), and then again letting aiokafka fall back to their Python crc32c implementation (which looks very similar to the one in this package) using AIOKAFKA_NO_EXTENSIONS=1.

So it's not exactly what you'd like to see to compare kafka-python's performance with/without the crc32c package, but it's a good proxy.

These are the results with the cython extension:

------------------------------------------------------------------------------------ benchmark 'aiokafka': 42 tests -----------------------------------------------------------------------------------
Name (time in ms)                          Min                Max               Mean            StdDev             Median               IQR            Outliers       OPS            Rounds  Iterations
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_aiokafka_throughput[50-512]        2.5464 (1.0)       5.2740 (1.26)      3.1885 (1.06)     0.5360 (2.09)      3.0273 (1.04)     0.7141 (2.75)         75;7  313.6275 (0.95)        270           1
test_aiokafka_throughput[50-256]        2.5564 (1.00)     52.1538 (12.42)     3.3553 (1.11)     3.5488 (13.87)     2.9222 (1.00)     0.5658 (2.18)          1;4  298.0376 (0.90)        194           1
test_aiokafka_throughput[20-256]        2.5792 (1.01)      6.2704 (1.49)      3.1622 (1.05)     0.6822 (2.67)      2.9169 (1.0)      0.5475 (2.11)        20;13  316.2368 (0.96)        160           1
test_aiokafka_throughput[20-4096]       2.6099 (1.02)      5.9206 (1.41)      3.1749 (1.05)     0.4486 (1.75)      3.0574 (1.05)     0.4896 (1.89)        65;10  314.9704 (0.95)        248           1
test_aiokafka_throughput[20-512]        2.6174 (1.03)      5.3253 (1.27)      3.1143 (1.03)     0.4233 (1.65)      2.9915 (1.03)     0.5139 (1.98)         51;7  321.1004 (0.97)        237           1
test_aiokafka_throughput[1-8192]        2.6430 (1.04)      6.4705 (1.54)      3.8006 (1.26)     0.6636 (2.59)      3.8086 (1.31)     0.7937 (3.06)         72;7  263.1183 (0.79)        234           1
test_aiokafka_throughput[20-2048]       2.6590 (1.04)      6.4980 (1.55)      3.8090 (1.26)     0.8345 (3.26)      3.6829 (1.26)     0.9702 (3.74)        49;10  262.5393 (0.79)        181           1
test_aiokafka_throughput[50-2048]       2.6903 (1.06)      5.3143 (1.27)      3.0213 (1.0)      0.2797 (1.09)      2.9628 (1.02)     0.2897 (1.12)        45;10  330.9843 (1.0)         275           1
test_aiokafka_throughput[50-1024]       2.7018 (1.06)      4.1982 (1.0)       3.1217 (1.03)     0.2559 (1.0)       3.0792 (1.06)     0.2721 (1.05)        65;11  320.3399 (0.97)        258           1
test_aiokafka_throughput[1-4096]        2.7079 (1.06)      5.0269 (1.20)      3.5364 (1.17)     0.4612 (1.80)      3.4427 (1.18)     0.5111 (1.97)         45;6  282.7743 (0.85)        148           1
test_aiokafka_throughput[100-512]       2.7316 (1.07)     12.1163 (2.89)      3.2486 (1.08)     1.2103 (4.73)      3.0136 (1.03)     0.2594 (1.0)          6;18  307.8291 (0.93)        214           1
test_aiokafka_throughput[100-256]       2.7384 (1.08)      5.5310 (1.32)      3.2356 (1.07)     0.4272 (1.67)      3.1302 (1.07)     0.4413 (1.70)        39;13  309.0589 (0.93)        239           1
test_aiokafka_throughput[20-1024]       2.7675 (1.09)      5.7710 (1.37)      3.6033 (1.19)     0.5117 (2.00)      3.4705 (1.19)     0.5817 (2.24)         36;6  277.5271 (0.84)        147           1
test_aiokafka_throughput[20-8192]       2.7832 (1.09)     14.2846 (3.40)      3.7502 (1.24)     1.3186 (5.15)      3.3713 (1.16)     0.9702 (3.74)         10;8  266.6521 (0.81)        223           1
test_aiokafka_throughput[100-1024]      2.7871 (1.09)      6.7990 (1.62)      3.3390 (1.11)     0.7595 (2.97)      3.0807 (1.06)     0.3736 (1.44)        23;25  299.4909 (0.90)        164           1
test_aiokafka_throughput[50-4096]       2.8773 (1.13)      6.1759 (1.47)      3.2976 (1.09)     0.3829 (1.50)      3.2057 (1.10)     0.3561 (1.37)        30;12  303.2549 (0.92)        212           1
test_aiokafka_throughput[1-2048]        2.8810 (1.13)      7.5159 (1.79)      3.7635 (1.25)     0.6678 (2.61)      3.5707 (1.22)     0.7569 (2.92)         40;9  265.7096 (0.80)        186           1
test_aiokafka_throughput[1-1024]        2.9704 (1.17)     43.6303 (10.39)     4.0327 (1.33)     3.1365 (12.26)     3.5654 (1.22)     0.6501 (2.51)          5;9  247.9737 (0.75)        198           1
test_aiokafka_throughput[100-2048]      3.0297 (1.19)      8.3983 (2.00)      3.5322 (1.17)     0.8092 (3.16)      3.2639 (1.12)     0.3607 (1.39)        15;16  283.1109 (0.86)        183           1
test_aiokafka_throughput[50-8192]       3.1379 (1.23)      5.9422 (1.42)      3.6180 (1.20)     0.4252 (1.66)      3.4576 (1.19)     0.4186 (1.61)        35;11  276.3988 (0.84)        235           1
test_aiokafka_throughput[1-512]         3.3163 (1.30)      6.5285 (1.56)      4.5014 (1.49)     0.7471 (2.92)      4.2804 (1.47)     1.0705 (4.13)         36;0  222.1516 (0.67)        119           1
test_aiokafka_throughput[100-4096]      3.3185 (1.30)     48.5615 (11.57)     4.1376 (1.37)     3.2624 (12.75)     3.7240 (1.28)     0.6152 (2.37)         1;11  241.6835 (0.73)        193           1
test_aiokafka_throughput[100-8192]      3.9206 (1.54)     17.2220 (4.10)      8.2149 (2.72)     3.0342 (11.86)     8.8913 (3.05)     5.5484 (21.39)        76;0  121.7293 (0.37)        192           1
test_aiokafka_throughput[1-256]         4.8913 (1.92)      8.0803 (1.92)      6.2037 (2.05)     0.9144 (3.57)      6.1716 (2.12)     1.4287 (5.51)         15;0  161.1942 (0.49)         34           1
test_aiokafka_throughput[200-256]       6.1884 (2.43)     12.5683 (2.99)      7.4988 (2.48)     1.4104 (5.51)      7.0006 (2.40)     1.1926 (4.60)          9;7  133.3541 (0.40)         69           1
test_aiokafka_throughput[200-512]       6.4519 (2.53)     12.4782 (2.97)      7.7151 (2.55)     1.0088 (3.94)      7.4770 (2.56)     1.1645 (4.49)         14;3  129.6168 (0.39)         86           1
test_aiokafka_throughput[200-1024]      6.6897 (2.63)     14.9998 (3.57)      7.9794 (2.64)     1.1736 (4.59)      7.6875 (2.64)     0.8656 (3.34)          8;6  125.3219 (0.38)         73           1
test_aiokafka_throughput[400-256]       7.0418 (2.77)     15.0846 (3.59)      8.6002 (2.85)     1.8022 (7.04)      7.8678 (2.70)     1.5413 (5.94)          6;6  116.2759 (0.35)         65           1
test_aiokafka_throughput[200-2048]      7.1759 (2.82)     38.8734 (9.26)      9.6143 (3.18)     3.7884 (14.81)     8.7125 (2.99)     1.3904 (5.36)         3;11  104.0117 (0.31)         78           1
test_aiokafka_throughput[400-512]       7.3630 (2.89)     14.3709 (3.42)      8.9670 (2.97)     1.6878 (6.60)      8.1951 (2.81)     1.7785 (6.85)         11;5  111.5197 (0.34)         65           1
test_aiokafka_throughput[400-1024]      7.6547 (3.01)     13.2578 (3.16)      9.1821 (3.04)     1.2988 (5.08)      8.7209 (2.99)     1.3127 (5.06)         15;7  108.9072 (0.33)         79           1
test_aiokafka_throughput[800-256]       8.2181 (3.23)     17.7408 (4.23)      9.4827 (3.14)     1.6244 (6.35)      9.0381 (3.10)     0.7639 (2.94)          5;5  105.4553 (0.32)         55           1
test_aiokafka_throughput[200-4096]      8.6050 (3.38)     14.8265 (3.53)     10.3700 (3.43)     1.5323 (5.99)      9.9268 (3.40)     1.6693 (6.43)         20;8   96.4321 (0.29)         86           1
test_aiokafka_throughput[400-2048]      8.9774 (3.53)     16.0428 (3.82)     10.7413 (3.56)     1.2253 (4.79)     10.3790 (3.56)     1.1571 (4.46)         16;3   93.0984 (0.28)         69           1
test_aiokafka_throughput[800-512]       9.3285 (3.66)     14.2023 (3.38)     10.6719 (3.53)     1.1974 (4.68)     10.3103 (3.53)     1.0783 (4.16)          9;5   93.7037 (0.28)         51           1
test_aiokafka_throughput[200-8192]     10.3648 (4.07)     14.4156 (3.43)     12.1626 (4.03)     0.8804 (3.44)     11.9967 (4.11)     1.0761 (4.15)         19;3   82.2195 (0.25)         65           1
test_aiokafka_throughput[800-1024]     10.4380 (4.10)     15.7460 (3.75)     12.0739 (4.00)     0.9951 (3.89)     11.9539 (4.10)     1.1920 (4.59)         15;2   82.8234 (0.25)         63           1
test_aiokafka_throughput[400-4096]     10.9955 (4.32)     17.5879 (4.19)     12.5439 (4.15)     1.0918 (4.27)     12.2303 (4.19)     1.1554 (4.45)          9;3   79.7200 (0.24)         65           1
test_aiokafka_throughput[800-2048]     12.5375 (4.92)     17.4044 (4.15)     13.7876 (4.56)     1.0301 (4.03)     13.4136 (4.60)     1.3695 (5.28)         13;2   72.5290 (0.22)         60           1
test_aiokafka_throughput[400-8192]     15.1162 (5.94)     36.4955 (8.69)     17.2862 (5.72)     3.6237 (14.16)    16.4954 (5.66)     1.0247 (3.95)          2;5   57.8496 (0.17)         49           1
test_aiokafka_throughput[800-4096]     16.3741 (6.43)     20.1640 (4.80)     17.9042 (5.93)     0.9294 (3.63)     17.6925 (6.07)     0.9449 (3.64)         15;4   55.8529 (0.17)         48           1
test_aiokafka_throughput[800-8192]     22.3237 (8.77)     32.1267 (7.65)     25.7185 (8.51)     2.9603 (11.57)    24.5136 (8.40)     3.9794 (15.34)        12;0   38.8825 (0.12)         42           1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

And these are with the python implementation:

--------------------------------------------------------------------------------------- benchmark 'aiokafka': 42 tests --------------------------------------------------------------------------------------
Name (time in ms)                           Min                 Max                Mean             StdDev              Median                IQR            Outliers       OPS            Rounds  Iterations
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_aiokafka_throughput[1-256]          2.3424 (1.0)        3.3216 (1.0)        2.5630 (1.0)       0.1652 (1.0)        2.5301 (1.0)       0.2086 (1.47)         41;5  390.1686 (1.0)         187           1
test_aiokafka_throughput[1-512]          2.3591 (1.01)      42.9855 (12.94)      2.7789 (1.08)      2.6879 (16.27)      2.5347 (1.00)      0.2274 (1.60)         1;10  359.8562 (0.92)        227           1
test_aiokafka_throughput[1-1024]         2.3982 (1.02)       4.4977 (1.35)       2.5915 (1.01)      0.2079 (1.26)       2.5434 (1.01)      0.1728 (1.22)        19;11  385.8833 (0.99)        194           1
test_aiokafka_throughput[1-2048]         2.5484 (1.09)       5.8763 (1.77)       2.8397 (1.11)      0.4069 (2.46)       2.8088 (1.11)      0.2562 (1.80)          5;4  352.1481 (0.90)        139           1
test_aiokafka_throughput[1-4096]         2.6524 (1.13)       9.2118 (2.77)       2.9212 (1.14)      0.5034 (3.05)       2.8369 (1.12)      0.1670 (1.17)          5;9  342.3198 (0.88)        196           1
test_aiokafka_throughput[20-256]         2.8953 (1.24)       5.0434 (1.52)       3.2009 (1.25)      0.3031 (1.83)       3.1278 (1.24)      0.2003 (1.41)        17;14  312.4148 (0.80)        209           1
test_aiokafka_throughput[1-8192]         3.0660 (1.31)      12.5326 (3.77)       3.5164 (1.37)      1.1325 (6.86)       3.3085 (1.31)      0.2231 (1.57)         5;11  284.3825 (0.73)        206           1
test_aiokafka_throughput[20-512]         3.4580 (1.48)       5.5636 (1.67)       3.7317 (1.46)      0.2578 (1.56)       3.6707 (1.45)      0.1618 (1.14)        12;10  267.9719 (0.69)        143           1
test_aiokafka_throughput[50-256]         3.7349 (1.59)       6.2574 (1.88)       4.0167 (1.57)      0.3242 (1.96)       3.9367 (1.56)      0.1421 (1.0)          8;10  248.9617 (0.64)        142           1
test_aiokafka_throughput[20-1024]        4.3155 (1.84)       8.7668 (2.64)       4.8071 (1.88)      0.6728 (4.07)       4.6647 (1.84)      0.2584 (1.82)          4;6  208.0257 (0.53)        106           1
test_aiokafka_throughput[50-512]         4.9716 (2.12)       8.9833 (2.70)       5.2628 (2.05)      0.4047 (2.45)       5.1947 (2.05)      0.1485 (1.05)          7;9  190.0136 (0.49)        110           1
test_aiokafka_throughput[100-256]        5.3119 (2.27)      10.2243 (3.08)       5.6930 (2.22)      0.5347 (3.24)       5.5914 (2.21)      0.2002 (1.41)          3;6  175.6532 (0.45)         95           1
test_aiokafka_throughput[20-2048]        6.2752 (2.68)      10.0552 (3.03)       6.5528 (2.56)      0.4368 (2.64)       6.4698 (2.56)      0.1716 (1.21)          3;6  152.6070 (0.39)         78           1
test_aiokafka_throughput[50-1024]        7.3875 (3.15)      22.4515 (6.76)       8.3188 (3.25)      2.4596 (14.89)      7.7623 (3.07)      0.3108 (2.19)          3;7  120.2097 (0.31)         63           1
test_aiokafka_throughput[100-512]        7.6203 (3.25)       8.6978 (2.62)       7.9488 (3.10)      0.1864 (1.13)       7.9367 (3.14)      0.2024 (1.42)         20;2  125.8046 (0.32)         72           1
test_aiokafka_throughput[200-256]        8.3091 (3.55)       9.5416 (2.87)       8.6877 (3.39)      0.2399 (1.45)       8.6495 (3.42)      0.2120 (1.49)         15;6  115.1051 (0.30)         69           1
test_aiokafka_throughput[20-4096]       10.0215 (4.28)      12.0249 (3.62)      10.5063 (4.10)      0.3072 (1.86)      10.4878 (4.15)      0.2806 (1.97)         24;4   95.1814 (0.24)         81           1
test_aiokafka_throughput[50-2048]       12.3220 (5.26)      51.5267 (15.51)     13.5171 (5.27)      4.7222 (28.59)     12.9485 (5.12)      0.3431 (2.41)          1;3   73.9805 (0.19)         67           1
test_aiokafka_throughput[100-1024]      12.4603 (5.32)      14.0228 (4.22)      13.0698 (5.10)      0.2865 (1.73)      13.0767 (5.17)      0.3436 (2.42)         19;1   76.5124 (0.20)         70           1
test_aiokafka_throughput[200-512]       13.0193 (5.56)      16.5878 (4.99)      13.7306 (5.36)      0.4762 (2.88)      13.6509 (5.40)      0.4782 (3.36)         14;1   72.8300 (0.19)         74           1
test_aiokafka_throughput[400-256]       14.6025 (6.23)      16.4865 (4.96)      15.3552 (5.99)      0.3188 (1.93)      15.3159 (6.05)      0.3790 (2.67)         14;2   65.1246 (0.17)         63           1
test_aiokafka_throughput[20-8192]       17.7873 (7.59)      19.9567 (6.01)      18.4777 (7.21)      0.3923 (2.38)      18.4242 (7.28)      0.5185 (3.65)         14;1   54.1192 (0.14)         52           1
test_aiokafka_throughput[50-4096]       21.9224 (9.36)      24.1567 (7.27)      23.1261 (9.02)      0.4092 (2.48)      23.1429 (9.15)      0.5535 (3.89)         10;1   43.2412 (0.11)         45           1
test_aiokafka_throughput[100-2048]      22.1476 (9.46)      26.1059 (7.86)      23.0560 (9.00)      0.5886 (3.56)      22.9704 (9.08)      0.4058 (2.86)          6;3   43.3727 (0.11)         43           1
test_aiokafka_throughput[200-1024]      22.6324 (9.66)      25.3995 (7.65)      23.9668 (9.35)      0.5912 (3.58)      23.9502 (9.47)      0.7066 (4.97)         12;2   41.7243 (0.11)         42           1
test_aiokafka_throughput[400-512]       24.0671 (10.27)     26.5516 (7.99)      25.2333 (9.85)      0.6201 (3.75)      25.1185 (9.93)      0.9739 (6.85)         14;0   39.6302 (0.10)         39           1
test_aiokafka_throughput[800-256]       26.8418 (11.46)     30.0656 (9.05)      27.9458 (10.90)     0.7042 (4.26)      27.8680 (11.01)     0.6456 (4.54)          7;3   35.7835 (0.09)         37           1
test_aiokafka_throughput[50-8192]       41.0898 (17.54)     44.1139 (13.28)     42.4833 (16.58)     0.8711 (5.27)      42.5029 (16.80)     1.2663 (8.91)          9;0   23.5387 (0.06)         24           1
test_aiokafka_throughput[400-1024]      42.1946 (18.01)     46.6887 (14.06)     43.6519 (17.03)     0.9996 (6.05)      43.3857 (17.15)     0.7128 (5.02)          4;2   22.9085 (0.06)         23           1
test_aiokafka_throughput[100-4096]      42.3928 (18.10)     45.7399 (13.77)     43.7459 (17.07)     1.0541 (6.38)      43.3820 (17.15)     1.9100 (13.44)         8;0   22.8593 (0.06)         23           1
test_aiokafka_throughput[200-2048]      43.3170 (18.49)     46.7597 (14.08)     44.7165 (17.45)     0.9514 (5.76)      44.6845 (17.66)     1.3299 (9.36)          9;0   22.3631 (0.06)         24           1
test_aiokafka_throughput[800-512]       46.0525 (19.66)     53.8723 (16.22)     47.4801 (18.53)     1.6647 (10.08)     47.0800 (18.61)     1.3950 (9.82)          1;1   21.0615 (0.05)         20           1
test_aiokafka_throughput[400-2048]      79.9063 (34.11)     84.9145 (25.56)     81.5831 (31.83)     1.5202 (9.20)      81.3022 (32.13)     2.1707 (15.27)         4;0   12.2574 (0.03)         12           1
test_aiokafka_throughput[100-8192]      81.9304 (34.98)     85.9707 (25.88)     83.5998 (32.62)     1.3033 (7.89)      83.3048 (32.93)     1.8384 (12.94)         4;0   11.9618 (0.03)         12           1
test_aiokafka_throughput[200-4096]      82.6204 (35.27)     86.5954 (26.07)     84.1309 (32.83)     1.4271 (8.64)      84.2550 (33.30)     2.1359 (15.03)         4;0   11.8862 (0.03)          9           1
test_aiokafka_throughput[800-1024]      84.0622 (35.89)     99.1440 (29.85)     86.8932 (33.90)     4.1233 (24.96)     85.7722 (33.90)     2.6785 (18.85)         1;1   11.5084 (0.03)         12           1
test_aiokafka_throughput[400-4096]     157.9699 (67.44)    163.3672 (49.18)    160.3500 (62.56)     2.1783 (13.19)    159.8016 (63.16)     3.9458 (27.76)         3;0    6.2364 (0.02)          6           1
test_aiokafka_throughput[200-8192]     164.4193 (70.19)    172.8172 (52.03)    167.8717 (65.50)     3.3122 (20.05)    166.9899 (66.00)     5.6250 (39.58)         3;0    5.9569 (0.02)          7           1
test_aiokafka_throughput[800-2048]     165.1656 (70.51)    174.2200 (52.45)    169.5745 (66.16)     3.7738 (22.85)    168.4172 (66.57)     7.1064 (50.00)         3;0    5.8971 (0.02)          6           1
test_aiokafka_throughput[400-8192]     309.5131 (132.14)   329.2418 (99.12)    316.0935 (123.33)    7.9628 (48.21)    315.1996 (124.58)    9.7401 (68.54)         1;0    3.1636 (0.01)          5           1
test_aiokafka_throughput[800-4096]     316.9306 (135.30)   336.2488 (101.23)   323.7794 (126.33)    8.2775 (50.11)    320.1210 (126.53)   12.7928 (90.02)         1;0    3.0885 (0.01)          5           1
test_aiokafka_throughput[800-8192]     643.5447 (274.74)   693.9136 (208.91)   661.1124 (257.95)   20.3392 (123.14)   657.7583 (259.97)   27.1496 (191.04)        1;0    1.5126 (0.00)          5           1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

That's a fairly substantial hit, especially towards bigger data sizes, where the difference is ~27x (see test_aiokafka_throughput[800-8192]). And that's with all the Kafka-related overheads included. If you compared the crc32c implementations on their own the difference would be higher still.
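For anyone who wants to check the arithmetic, the slowdown can be computed directly from the two tables above. The sketch below hard-codes the Median column (in ms) for three of the test cases, copied from the results; the variable and function names are just for illustration.

```python
# Median times in ms copied from the two tables above:
# case -> (Cython extension, pure-Python fallback).
MEDIAN_MS = {
    "1-256": (6.1716, 2.5301),
    "100-1024": (3.0807, 13.0767),
    "800-8192": (24.5136, 657.7583),
}


def slowdown(cython_ms, python_ms):
    """How many times slower the pure-Python run is than the Cython run."""
    return python_ms / cython_ms


ratios = {case: slowdown(*pair) for case, pair in MEDIAN_MS.items()}

for case, ratio in ratios.items():
    print(f"test_aiokafka_throughput[{case}]: {ratio:.1f}x")
```

The largest case comes out at roughly 27x, while the smallest case shows no penalty at all (the Python run is even faster there). That is presumably because for tiny workloads the checksum cost is lost among other overheads and run-to-run noise.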

rtobar commented 1 year ago

Forgot to mention: for reference, this is the benchmark we were using: https://gitlab.com/ska-telescope/sdp/ska-sdp-realtime-receive-processors/-/blob/8641b4c518e370a8592c3f2a0c89ab4da2e3ef60/tests/unit/test_pointing_performance.py#L48-70

KeatonWakefield01 commented 1 year ago

Awesome, thank you!