amazon-ion / ion-python

A Python implementation of Amazon Ion.
https://amazon-ion.github.io/ion-docs/
Apache License 2.0
253 stars, 50 forks

Try to read and convert Ion INT as int 64 #320

Closed rmarrowstone closed 6 months ago

rmarrowstone commented 6 months ago

This change makes it so that we try to read Ion INTs as int64s and convert them to Python with PyLong_FromLongLong. If the int is too large, we fall back to converting via its base-10 string representation.
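For readers who want to see the shape of the two paths, here is a rough Python sketch, not the actual C code in this PR. It calls the real `PyLong_FromLongLong` and `PyLong_FromString` functions through `ctypes.pythonapi` (CPython only). The names `try_read_int64` and `ion_int_to_python` are hypothetical, and the decimal text input is just a stand-in for whatever the Ion reader holds.

```python
import ctypes
from typing import Optional

INT64_MAX = 2**63 - 1

# Bind the two CPython C API calls named in the description.
_from_long_long = ctypes.pythonapi.PyLong_FromLongLong
_from_long_long.argtypes = [ctypes.c_longlong]
_from_long_long.restype = ctypes.py_object

_from_string = ctypes.pythonapi.PyLong_FromString
_from_string.argtypes = [ctypes.c_char_p, ctypes.c_void_p, ctypes.c_int]
_from_string.restype = ctypes.py_object


def try_read_int64(text: str) -> Optional[int]:
    """Hypothetical stand-in for an int64 read: returns None on overflow."""
    negative = text.startswith("-")
    digits = text[1:] if negative else text
    magnitude = 0
    for ch in digits:
        magnitude = magnitude * 10 + (ord(ch) - ord("0"))
        if magnitude > 2**63:  # cannot fit in int64 with either sign
            return None
    value = -magnitude if negative else magnitude
    return None if value > INT64_MAX else value


def ion_int_to_python(text: str) -> int:
    """Hypothetical helper: fast int64 path first, base-10 string fallback."""
    small = try_read_int64(text)
    if small is not None:
        return _from_long_long(small)  # fast path
    # Slow path: hand the base-10 representation to PyLong_FromString.
    return _from_string(text.encode("ascii"), None, 10)
```

Values within the int64 range take the first branch; anything larger goes through the string conversion.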

I don't like adding complexity but this is significantly faster for datasets with small ints. For example, the "service_log_legacy" benchmark is about 7% faster with this change.

Note for Reviewers: I looked at other options for efficiently creating PyLongs (all Python ints are actually "Longs" in Python 3) from Ion INTs. The highest-"bandwidth" way to handle arbitrary-size ints is to get them as two's complement byte arrays. Unfortunately there is no Python C API call for that, so you need to go through the Object Call protocol, which adds a lot of overhead. There is no existing hex or base-32 char array export that we could give to PyLong_FromString. Even if there were, that path would still require allocations.
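To make the byte-array option concrete, this is roughly what it looks like at the Python level. From C, the equivalent would mean invoking `int.from_bytes` through the object call machinery, which is the overhead mentioned above. The helper name and the big-endian layout are assumptions for illustration only, not a description of what the Ion reader exports.

```python
def int_from_twos_complement(data: bytes) -> int:
    """Hypothetical helper: decode a big-endian two's complement byte array."""
    return int.from_bytes(data, byteorder="big", signed=True)


# Round-trip a value too large for int64.
value = -(2**100)
# A signed encoding needs bit_length() + 1 bits, rounded up to whole bytes.
length = (value.bit_length() + 8) // 8
encoded = value.to_bytes(length, byteorder="big", signed=True)
decoded = int_from_twos_complement(encoded)
```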

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.