This is an alternative to #102. At a high level, it implements the "compact memory representation" described in https://github.com/cockroachdb/apd/pull/102#issuecomment-1005078022. Compared to the approach in that PR, this approach is a) faster for most operations, b) more usable because values can be safely copied, and c) half the memory size (32 bytes per Decimal, vs. 64).
The memory representation of the Decimal struct in this approach looks like:
```go
type Decimal struct {
    Form     int8
    Negative bool
    Exponent int32
    Coeff    BigInt {
        _inner  *big.Int // nil when value fits in _inline
        _inline [2]uint
    }
} // sizeof = 32
```
With a two-word inline array, any value that would fit in a 128-bit integer (i.e. decimals with a scale-adjusted absolute value up to 2^128 - 1) fits in _inline. The indirection through _inner is only used for values larger than this.
Before this change, the memory representation of the Decimal struct looked like:
```go
type Decimal struct {
    Form     int64
    Negative bool
    Exponent int32
    Coeff    big.Int {
        neg bool
        abs []big.Word {
            data uintptr ---------------.
            len  int64                  v
            cap  int64         [uint, uint, ...] // sizeof = variable, but around cap = 4, so 32 bytes
        }
    }
} // sizeof = 48 flat bytes + variable-length heap-allocated array
```
This commit introduces a performance optimization that embeds small coefficient values directly in their Decimal struct, instead of storing these values in a separate heap allocation. It does so by replacing math/big.Int with a new wrapper type called BigInt that provides an "inline" compact representation optimization.
Each BigInt maintains (through big.Int) an internal reference to a variable-length integer value, which is represented by a []big.Word. The _inline field and the inner and updateInner methods combine to allow BigInt to inline this variable-length integer array within the BigInt struct when its value is sufficiently small. In the inner method, we point a temporary big.Int's nat slice at this _inline array. big.Int will avoid re-allocating this array until it is provided with a value that exceeds the initial capacity. Later, in updateInner, we detect whether the array has been re-allocated. If so, we switch to using _inner. If not, we continue to use the inline array.
We set the capacity of the inline array to accommodate any value that would fit in a 128-bit integer (i.e. values with an absolute value up to 2^128 - 1).
This is an alternative to an optimization that many other arbitrary precision decimal libraries have where small coefficient values are stored as numeric fields in their data type's struct. Only when this coefficient value gets large enough do these libraries fall back to a variable-length coefficient with internal indirection. We can see the optimization in practice in the ericlagergren/decimal library, where each struct contains a compact uint64 and an unscaled big.Int. Prior concern from the authors of cockroachdb/apd regarding this form of compact representation optimization was that it traded performance for complexity. The optimization fractured control flow, leaking out across the library and leading to more complex, error-prone code.
The approach taken in this commit does not have the same issue. All arithmetic on the decimal's coefficient is still deferred to big.Int.
Fourth time's the charm.
Replaces cockroachdb/cockroach#74369. Replaces https://github.com/cockroachdb/apd/pull/101. Replaces https://github.com/cockroachdb/apd/pull/102.
Impact on benchmarks: