amazon-ion / ion-python

A Python implementation of Amazon Ion.
https://amazon-ion.github.io/ion-docs/
Apache License 2.0

Avoids unnecessary method invocations in IonPyDict's add_item method. #290

Closed cheqianh closed 1 year ago

cheqianh commented 1 year ago

Description:

IonPyDict's add_item is called many times, including recursively, when processing Ion data that contains IonPyDict values. This pull request removes unnecessary method calls from add_item, which yields a 4-5% performance improvement.
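To illustrate the kind of overhead being described, here is a minimal sketch. It is not the actual IonPyDict code: the class `MultiDictSketch` and the methods `add_item_indirect` and `add_item_direct` are hypothetical names, and the backing-store layout (key mapped to a list of values) is an assumption made for the example. It contrasts an insert that goes through the mapping's own dunder methods with one that touches the backing store directly.

```python
from collections import OrderedDict
from collections.abc import MutableMapping


class MultiDictSketch(MutableMapping):
    """Hypothetical stand-in for a multimap; not the real IonPyDict."""

    def __init__(self):
        self._store = OrderedDict()  # assumed layout: key -> list of values

    def __getitem__(self, key):
        return self._store[key][-1]  # the last value wins on lookup

    def __setitem__(self, key, value):
        self._store[key] = [value]

    def __delitem__(self, key):
        del self._store[key]

    def __iter__(self):
        return iter(self._store)

    def __len__(self):
        return len(self._store)

    def add_item_indirect(self, key, value):
        # `key in self` uses Mapping.__contains__, which calls __getitem__
        # and catches KeyError: several Python-level calls per insert.
        if key in self:
            self._store[key].append(value)
        else:
            self[key] = value  # one more dispatch through __setitem__

    def add_item_direct(self, key, value):
        # Work on the backing store directly: one lookup, no extra dispatch.
        values = self._store.get(key)
        if values is None:
            self._store[key] = [value]
        else:
            values.append(value)
```

Both methods leave the store in the same state; the direct version simply does it with fewer Python-level calls, which matters when the method runs once per field.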

Highlights:

I also tried rewriting the C extension to insert key-value pairs into the dictionary directly instead of calling add_item (see code lines 210-214), but that turned out to be even slower than invoking the method.

Additionally, we store key-value pairs in an OrderedDict, so we cannot use the raw CPython dict API (such as PyDict_SetItem) to insert values.
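A small Python-level sketch of why the raw dict API is a poor fit here. OrderedDict is a dict subclass, but it overrides `__setitem__` to maintain its own ordering bookkeeping. My understanding (an assumption, not something this PR verifies) is that a raw call such as PyDict_SetItem writes to the underlying dict storage without going through that override. The variable names below are illustrative only.

```python
from collections import OrderedDict

od = OrderedDict()

# OrderedDict is a dict subclass, so a dict type check accepts it.
is_dict_subclass = isinstance(od, dict)

# It supplies its own __setitem__ rather than inheriting dict's.
overrides_setitem = OrderedDict.__setitem__ is not dict.__setitem__

# Inserting through the mapping protocol keeps the ordering state intact,
# so order-aware methods such as move_to_end behave as expected.
od["b"] = 1
od["a"] = 2
od.move_to_end("b")
keys_in_order = list(od)
```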

Performance benchmarking

The sample files we use show a 4-5% improvement. However, for specific use cases, such as a large dictionary with unique (non-repeated) keys, the change eliminates a large number of unnecessary internal method invocations.

| | committed | complex | legacy |
|---|---|---|---|
| Config | iterations:100 warmups:10 | iterations:100 warmups:10 | iterations:80 warmups:5 |
| Before time_mean | 12,120 | 777,385 | 13,526,830,177 |
| After time_mean | 11,661 | 736,649 | 13,339,152,123 |
| Improvement | -4% | -5.2% | -1.4% |
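For reference, the Improvement row can be reproduced from the before and after means as a relative change. The snippet below is only a check of that arithmetic; the helper name `relative_change` is made up for this example, and the column names are taken as they appear in the table.

```python
# time_mean values copied from the benchmark table above.
before = {"committed": 12_120, "complex": 777_385, "legacy": 13_526_830_177}
after = {"committed": 11_661, "complex": 736_649, "legacy": 13_339_152_123}


def relative_change(before_value, after_value):
    """Return the change from before to after as a percentage of before."""
    return (after_value - before_value) / before_value * 100


# Negative numbers mean the mean time went down.
changes = {
    name: round(relative_change(before[name], after[name]), 1)
    for name in before
}
print(changes)
```

The committed column comes out at -3.8 before rounding to a whole percent, which matches the -4% reported in the table.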

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.