Closed by matthiasdiener 1 year ago
Yes, something like that would work.
When built against nanobind (+ #116), this provides a modest speedup for the microbenchmark on M1:
I'm OK with option b) in both cases. I'm not sure option a) would work unmodified. You would likely need to call the casting function manually somewhere in there in order to get it to work, and this may be incrementally faster than option b). But a lot depends on exactly how much faster, and whether that justifies the extra effort (and generated code).
At the same time, there's a small amount of speed-up here, and we're net deleting code: I'd call it a win regardless and would be happy to merge this, I think.
Option a) works as written above (possibly also because it makes use of `implicitly_convertible`?). Both options are very close performance-wise in my test.
Thanks!
Is this the approach you had in mind @inducer? (the actual code to create the upcasts would be added to gen_wrap.py)
1. Substitute for `make_new_upcast_wrapper`:
   Option a)
   Option b)
   (I think the actual cast is handled by `implicitly_convertible`)
2. Substitute for `make_existing_upcast_wrapper`:
   Option a)
   Option b)
Edit: I implemented option b) for both cases.
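
For readers unfamiliar with the idea behind `implicitly_convertible`, here is a toy model in plain Python. It is only a sketch of the concept (a registered conversion applied automatically at the call boundary) and is not nanobind's API, nor the code that `gen_wrap.py` generates. The names `BasicSet`, `Set`, `register_implicit`, `cast_to`, and `set_union` are hypothetical stand-ins chosen for illustration.

```python
from typing import Callable, Dict, Tuple


# Hypothetical stand-ins for a subtype/supertype pair; not the real wrapped classes.
class BasicSet:
    def __init__(self, desc: str) -> None:
        self.desc = desc


class Set:
    def __init__(self, desc: str) -> None:
        self.desc = desc


# Registry of permitted implicit conversions: (from_type, to_type) -> converter.
_CONVERSIONS: Dict[Tuple[type, type], Callable[[object], object]] = {}


def register_implicit(from_type: type, to_type: type,
                      converter: Callable[[object], object]) -> None:
    """Declare that from_type may be converted to to_type automatically."""
    _CONVERSIONS[(from_type, to_type)] = converter


def cast_to(value: object, target_type: type) -> object:
    """Return value as target_type, applying a registered conversion if needed."""
    if isinstance(value, target_type):
        return value
    converter = _CONVERSIONS.get((type(value), target_type))
    if converter is None:
        raise TypeError(
            f"cannot convert {type(value).__name__} to {target_type.__name__}")
    return converter(value)


def set_union(a: object, b: object) -> Set:
    """A function declared on Set that also accepts anything convertible to Set."""
    a_set = cast_to(a, Set)
    b_set = cast_to(b, Set)
    return Set(f"({a_set.desc}) | ({b_set.desc})")


# Register the upcast once; callers never have to request it explicitly.
register_implicit(BasicSet, Set, lambda bs: Set(bs.desc))
```

In this toy model, calling `set_union(BasicSet("x"), Set("y"))` succeeds because the `BasicSet` argument is upcast before the call body runs, which is roughly the work a hand-written upcast wrapper would otherwise have to do.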