hpyproject / hpy

HPy: a better API for Python
https://hpyproject.org
MIT License

Microbenchmark test_allocate_obj is slower in HPy than with the old Python/C API #362

Open antocuni opened 2 years ago

antocuni commented 2 years ago

These are the microbenchmarks:
https://github.com/hpyproject/hpy/blob/6fcb15e13611f111cfb638573e84e9604458abf3/microbench/test_microbench.py#L103-L112
https://github.com/hpyproject/hpy/blob/99bf0425168e2c22902e2f8ff01b1bf9f59bd913/microbench/test_microbench.py#L165-L174
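
For readers who do not want to follow the links, here is a rough sketch of what an allocation microbenchmark of this kind looks like. It is not the code from test_microbench.py: the `Foo` class, the `allocate_many` and `best_time_us` helpers, and the iteration count are all stand-ins chosen for illustration, and a pure-Python class takes the place of the real extension types.

```python
import timeit


class Foo:
    """Stand-in for the extension type under test (hypothetical)."""


def allocate_many(cls, n=100_000):
    """Allocate and immediately drop n instances of cls."""
    for _ in range(n):
        cls()


def best_time_us(cls, repeat=5):
    """Best-of-`repeat` wall time of one allocate_many call, in microseconds."""
    times = timeit.repeat(lambda: allocate_many(cls), number=1, repeat=repeat)
    return min(times) * 1e6


if __name__ == "__main__":
    print(f"{best_time_us(Foo):.2f} us")
```

Taking the minimum over several repeats is one common way to reduce noise from other processes; the real benchmark harness may aggregate differently.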

Here, I am comparing:

We should expect zero overhead, but HPy is ~15% slower, at least on my machine. I ran the benchmarks several times: static types are consistently slightly faster than heap types even in the cpy case, but HPy is consistently much slower in both cases:

                                                     cpy                    hpy
                                        ----------------    -------------------
TestType::test_allocate_obj                    670.44 us       783.15 us [1.17]
TestHeapType::test_allocate_obj                696.72 us       788.42 us [1.13]
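
The bracketed numbers appear to be the hpy/cpy time ratio (an assumption, since the table does not label them). A quick check of that reading, using only the timings from the table; the `slowdown` helper is a name made up for this sketch:

```python
# Timings in microseconds, copied from the table above: (cpy, hpy).
timings_us = {
    "TestType::test_allocate_obj": (670.44, 783.15),
    "TestHeapType::test_allocate_obj": (696.72, 788.42),
}


def slowdown(cpy_us, hpy_us):
    """Ratio of HPy time to CPython C API time; > 1 means HPy is slower."""
    return hpy_us / cpy_us


if __name__ == "__main__":
    for name, (cpy_us, hpy_us) in timings_us.items():
        print(f"{name}: {slowdown(cpy_us, hpy_us):.2f}")
```

The ratios round to 1.17 and 1.13, i.e. 13% to 17% slower, which is consistent with the "~15%" figure.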

cklein commented 1 year ago

The first difference I can see is that cpy_simple.Foo calls PyType_GenericNew, while hpy_simple.Foo calls object_new.

The latter looks more complicated to my untrained eye (although both eventually call type->tp_alloc).
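
One part of that extra work is visible from pure Python: object_new rejects surplus constructor arguments when the type does not override `__init__`, whereas PyType_GenericNew ignores its arguments and just calls tp_alloc. A small illustration, where `Plain`, `WithInit` and `rejects_extra_args` are hypothetical names for this sketch and say nothing about how HPy sets up tp_new:

```python
class Plain:
    """Uses the default object.__new__ and object.__init__."""


class WithInit:
    """Overrides __init__, so object.__new__ lets extra arguments through."""

    def __init__(self, a, b):
        self.a = a
        self.b = b


def rejects_extra_args(cls):
    """Return True if calling cls(1, 2) raises TypeError."""
    try:
        cls(1, 2)
    except TypeError:
        return True
    return False


if __name__ == "__main__":
    print(rejects_extra_args(Plain))
    print(rejects_extra_args(WithInit))
```

This only shows that the check exists on the object_new path; whether it accounts for the measured slowdown would need profiling.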

fangerer commented 1 year ago

@cklein you are probably right. That looks like the reason, since object_new does additional argument checking and also creates the object's dict eagerly. Thanks for the hint. I'll look into how to fix that.