serge-sans-paille / pythran

Ahead of Time compiler for numeric kernels
https://pythran.readthedocs.io
BSD 3-Clause "New" or "Revised" License

are there workarounds to avoid copying function non-array arguments? #2012

Open gdementen opened 1 year ago

gdementen commented 1 year ago

Hello. Thanks for the great library.

I have code that looks like this:

import numpy as np

#pythran export look_pythran(int[], int:int dict)
#pythran export look_pythran(int list, int:int dict)
def look_pythran(key, mapping):
    # Return the same container kind as the input: ndarray in, ndarray out.
    if isinstance(key, np.ndarray):
        return np.array([mapping[k] for k in key])
    else:
        return [mapping[k] for k in key]

With small mappings, the performance is very nice, especially for large keys... but for large mappings, Pythran is much slower than the pure Python version. I suppose this is because Pythran copies the arguments (https://pythran.readthedocs.io/en/latest/MANUAL.html#limitations), but I haven't found a way to work around this.
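To make the suspected cost concrete, here is a minimal pure-Python sketch. It only models a per-call conversion with `dict(mapping)`; that copy is an assumption standing in for whatever conversion Pythran performs, and the names `look_python` and `look_with_copy` are hypothetical.

```python
def look_python(key, mapping):
    """Pure Python reference: uses the caller's dict directly."""
    return [mapping[k] for k in key]


def look_with_copy(key, mapping):
    """Stand-in for a call that converts its dict argument on entry.

    dict(mapping) is only a model of an O(len(mapping)) conversion step;
    it is not Pythran's actual mechanism.
    """
    converted = dict(mapping)  # cost grows with the mapping, not with key
    return [converted[k] for k in key]


small_key = [1, 2, 3]
big_mapping = {i: i * 2 for i in range(100_000)}

result_direct = look_python(small_key, big_mapping)
result_copied = look_with_copy(small_key, big_mapping)
```

Both calls return the same values, but the second one does work proportional to `len(big_mapping)` on every call, even though only three keys are looked up. Timing the two with `timeit` is one way to see how the conversion step can dominate for large mappings.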

For example,

Thanks for any help

serge-sans-paille commented 1 year ago

Interesting question, sorry for not answering earlier. There's currently no straightforward way, but it would indeed be great to have one. Among the three solutions you propose, the first one makes the most sense to me. I wonder what the best syntax to support this would be... probably a dedicated type?

gdementen commented 1 year ago

Hmm. FWIW, I never answered this because I agree with you... But sometimes an explicit answer is better than an implicit one 😉.

IMO, the dedicated type is indeed the best you could offer, as it would make it possible to control when the conversion first happens and when it needs to happen again. Something like a "cached" modifier for the argument in the Pythran function annotation (the only other option I see, but I might be short-sighted on this) would only support caching on the first call. But maybe the "cached" argument modifier could make more optimizations possible? In that case, you could support both.
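The difference between the two options can be sketched in plain Python. Nothing below is Pythran API: `ConvertedMapping`, `refresh`, `look` and `look_cached` are hypothetical names, and `dict(...)` again just stands in for the Python-to-native conversion.

```python
class ConvertedMapping:
    """Hypothetical 'dedicated type': the caller decides when to convert."""

    def __init__(self, source):
        self._source = source
        self.conversions = 0
        self.refresh()

    def refresh(self):
        # Stand-in for the Python -> native conversion.
        self._native = dict(self._source)
        self.conversions += 1

    def __getitem__(self, k):
        return self._native[k]


def look(key, mapping):
    return [mapping[k] for k in key]


_cache = {}


def look_cached(key, mapping):
    """Hypothetical 'cached' modifier: convert on the first call only.

    Keyed by id(mapping), which is only safe while the object stays alive.
    """
    native = _cache.get(id(mapping))
    if native is None:
        native = dict(mapping)
        _cache[id(mapping)] = native
    return [native[k] for k in key]


source = {1: 10, 2: 20}
conv = ConvertedMapping(source)
first = look([1, 2], conv)               # [10, 20]
source[1] = 11
stale = look([1], conv)                  # [10], not converted again yet
conv.refresh()                           # explicit re-conversion
fresh = look([1], conv)                  # [11]

cached_first = look_cached([1], source)  # [11], converted on this call
source[1] = 12
cached_second = look_cached([1], source) # [11], no way to ask for a refresh
```

In this model the dedicated type pays for exactly two conversions, both chosen by the caller, while the cached variant converts once and has no hook for picking up later changes to the source dict.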