This change makes it so that we try to read Ion INTs as int64s and
convert them to Python with PyLong_FromLongLong. If the int is too
large, we fall back to converting via its base-10 string repr.
I don't like adding complexity, but this is significantly faster for
datasets with small ints. For example, the "service_log_legacy"
benchmark is about 7% faster with this change.
Note for Reviewers:
I looked at other options for efficiently creating PyLongs (all Python
ints are actually "longs" in Python 3) from Ion INTs. The highest-
"bandwidth" way to construct arbitrary-size ints is from two's-
complement byte arrays. Unfortunately, there is no Python C API
call for that, so you have to go through the object-call protocol,
which adds a lot of overhead. There is also no existing hex or base-32
char array export that we could hand to PyLong_FromString. Even if
there were, that path would still require allocations.
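For context on the two's-complement alternative: on the Python side the conversion is int.from_bytes, and from C that has to be reached through the object-call protocol (e.g. PyObject_CallMethod on the int type), since there is no dedicated C API constructor for it. A sketch of the equivalent conversion:

```python
def pylong_from_twos_complement(buf: bytes) -> int:
    # Equivalent of what the C extension would have to invoke via the
    # object-call protocol; big-endian, signed two's complement.
    return int.from_bytes(buf, byteorder="big", signed=True)
```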
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.