Text de/encoding abstraction

oscbyspro commented 1 year ago

The text conversions of NBKDoubleWidth and NBKFlexibleWidth can almost be abstracted and packaged as a standalone module, perhaps as a formatter. I think that would be neat. I'll see what changes I need to make, I don't think it's much.

oscbyspro commented 1 year ago

One idea that does not require additional protocol requirements is copying the words to a temporary allocation and then doing whatever from there. Copying the words has some overhead, but I don't know whether it's significant compared to the rest of it (I doubt it). With some conveniences on StrictBinaryInteger (#85) it wouldn't even be that bad to perform arithmetic on an UnsafeMutableBufferPointer. (On the topic of encoding.)

oscbyspro commented 1 year ago

Note: if radix literal encoding is important, then I think it should be part of a dedicated hexadecimal (etc) formatter. I think that's cleaner, because otherwise you end up with some silly branch checking the radix or whether the radix matches the literal. It's also non-option for most radices.

oscbyspro commented 1 year ago

At a glance, I think I can abstract the decoding part (efficiently) if binary integers have an init(words:) method.

oscbyspro commented 1 year ago

Maybe I want some kind of static maxBitWidth, so I can break too-large-fixed-width decoding early. Or a static bitWidth: Int? to check for fixed-width? Just throwing some ideas around. A single implementation would be nice.

oscbyspro commented 1 year ago

I can save some multiplications with a staggered approach (more significant for big integers):

while ... {
    .....
    words[endIndex] = SUISS.multiplyFullWidth(&words, by: radix.power, add: word, upTo: endIndex) // new
    words.formIndex(after: &endIndex)
}

Additionally, you don't need to pre-initialize the memory when doing that. You can do it in-loop.

oscbyspro commented 1 year ago

Hm. If I'm using an init(words:) method, maybe I should try ContigousArray(...) vs withUnsafeTemporaryAllocation(...), since a big integer might just take the array without copying anything (assuming the big integer is array-based).

Edit: withUnsafeTemporaryAllocation(...) much better, for fixed-width at least.

oscbyspro commented 1 year ago

I see some initial performance hits, but I believe I can fix most of them. Still, it's a huge maintainability win if NBKDoubleWidth, NBKFlexibleWidth and NBKSigned can share one text decoding/encoding implementation.

oscbyspro commented 1 year ago

I don't think it will end up being too complex, so I might create a private namespace rather than a new module.

oscbyspro commented 1 year ago

Looking at Instruments, it appears that UInt256.init(words:) accounts for about 5% of UInt256.init(_:radix:).

oscbyspro commented 1 year ago

It turns out that short circuiting decoding based on length is a bit finicky and can only be done exactly in base 2.

Maybe I'll let it behave as-if I'm converting it to a big integer first (which is basically what I'm doing with pointers).

oscbyspro commented 1 year ago

Hm. I suppose I can perform perfect radix decoding without a temporary allocation, via a custom lazy collection.

Edit: Hm, maybe not. The subscript would need to be throwing or return optional elements.

oscbyspro commented 1 year ago

Hm. Most of the decoding performance issue is solved by making the decoding algorithm non-generic and then:

@inlinable static func decode<Magnitude: NBKUnsignedInteger>(digits: NBK.UnsafeUTF8, solution: PerfectRadixSolution<UInt>, as type: Magnitude.Type = Magnitude.self) -> Magnitude? {
    var    magnitude:   Magnitude?
    self.decode(digits: digits, solution: solution) { magnitude = Magnitude(words: $0) }
    return magnitude as Magnitude?
}

@usableFromInline static func decode(digits: NBK.UnsafeUTF8, solution: PerfectRadixSolution<UInt>, perform: (NBK.UnsafeWords) -> Void) -> Void? { 
    ...
}

oscbyspro commented 1 year ago

Note to self: merge future branches with pull requests so you have something to reference later. Zzz.

oscbyspro commented 1 year ago

The the encoder works for Signed\<Magnitude>, but I need a sign-magnitude-pair decoding method.

oscbyspro / Numberick

Text de/encoding abstraction #86