Kotlin / kotlinx-io

Kotlin multiplatform I/O library
Apache License 2.0

Add support for writing other primitive arrays to Buffers #357

Open DRSchlaubi opened 3 months ago

DRSchlaubi commented 3 months ago

Due to the deprecation of the Ktor I/O APIs, we need to migrate this code. However, kotlinx-io currently offers no easy replacement for it:

internal actual fun formatIntegerFromLittleEndianLongArray(data: LongArray) =
    withBuffer(data.size * Long.SIZE_BYTES) {
        // need to convert from little-endian data to big-endian expected by BigInteger
        writeFully(data.reversedArray())
        BigInteger.fromByteArray(readBytes(), Sign.POSITIVE).toString()
    }

JakeWharton commented 3 months ago

If performance is a concern here, I would recommend that you create the byte array directly and then write the longs by iterating in reverse and writing their bytes directly with shifts. This will produce zero extra allocations or copies. The current code has a lot of wasted allocations and copies.
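
A minimal sketch of the direct approach described above, using only the Kotlin standard library. The function name `littleEndianLongsToBigEndianBytes` is hypothetical and not part of kotlinx-io or the original code. It allocates only the result array, walks the longs in reverse, and emits each one most significant byte first:

```kotlin
// Hypothetical helper (the name is an assumption, not an existing API).
// Input: longs ordered least significant word first.
// Output: one big-endian byte array, the only allocation made here.
fun littleEndianLongsToBigEndianBytes(data: LongArray): ByteArray {
    val out = ByteArray(data.size * Long.SIZE_BYTES)
    var offset = 0
    // Iterate from the last long to the first so the most significant word is written first.
    for (i in data.size - 1 downTo 0) {
        val value = data[i]
        // Write the 8 bytes of this long, most significant byte first.
        for (shift in 56 downTo 0 step 8) {
            out[offset++] = (value ushr shift).toByte()
        }
    }
    return out
}
```

The resulting array could then be passed to `BigInteger.fromByteArray(bytes, Sign.POSITIVE)` in the same way as the other snippets in this thread.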

If you just want to do this with kotlinx-io, you can do something like:

val buffer = Buffer()
data.reversedArray().forEach(buffer::writeLong)
BigInteger.fromByteArray(buffer.readByteArray(), Sign.POSITIVE).toString()

This wastes approximately the same number of allocations as the original code.

A very easy improvement would be to avoid the LongArray copy:

val buffer = Buffer()
data.indices.reversed().forEach { buffer.writeLong(data[it]) }
BigInteger.fromByteArray(buffer.readByteArray(), Sign.POSITIVE).toString()

I'll also note that you don't appear to actually be doing an endianness switch. The longs are written in reverse order, but the byte order within each long is maintained. If you need an actual endianness switch, you can use writeLongLe for the function reference.
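
To make the distinction concrete, here is a small standard-library-only sketch of the two byte layouts for a single long. The helper names are hypothetical. They are meant to mirror the big-endian layout that `writeLong` produces and the little-endian layout that `writeLongLe` produces:

```kotlin
// Hypothetical helpers for illustration only; not part of kotlinx-io.

// Big-endian: most significant byte first (the layout writeLong uses).
fun Long.toBigEndianBytes(): ByteArray =
    ByteArray(Long.SIZE_BYTES) { i -> (this ushr (56 - 8 * i)).toByte() }

// Little-endian: least significant byte first (the layout writeLongLe uses).
fun Long.toLittleEndianBytes(): ByteArray =
    ByteArray(Long.SIZE_BYTES) { i -> (this ushr (8 * i)).toByte() }
```

For `0x0102030405060708L`, the first helper yields the bytes 01 02 03 04 05 06 07 08 and the second yields 08 07 06 05 04 03 02 01. Reversing only the order of the longs changes which word comes first but leaves each word in the first layout.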

fzhinkin commented 2 months ago

One concern with an API to read/write arbitrary primitive array types is API surface bloat. Following what we already have for primitive types and for byte arrays, the following could be added:

That's around 42 functions, which is a lot, especially given that most of them may remain unused.
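
For a sense of what a single one of these functions might look like, here is a hedged sketch. `writeLongs` and its parameters are hypothetical and do not exist in kotlinx-io; the shape is only loosely modeled on a ranged array write and is built on the existing `Sink.writeLong`:

```kotlin
import kotlinx.io.Sink

// Hypothetical extension: NOT part of kotlinx-io. Name and parameters are assumptions.
fun Sink.writeLongs(source: LongArray, startIndex: Int = 0, endIndex: Int = source.size) {
    require(startIndex in 0..endIndex && endIndex <= source.size) {
        "Invalid range [$startIndex, $endIndex) for array of size ${source.size}"
    }
    // Each element goes through the existing writeLong, i.e. big-endian byte order.
    for (i in startIndex until endIndex) writeLong(source[i])
}
```

A function of this shape would presumably be needed per array type and per direction, plus byte-order variants, which is where the growth in API surface comes from.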

However, we can provide optimized versions of these operations (using the Unsafe API and, for example, var handles or byte buffer views on JVM).
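
As an illustration of the byte buffer view idea on the JVM, here is a minimal sketch that converts a whole `LongArray` in one bulk operation. It is JVM-only, the function name `longArrayToBytes` is hypothetical, and it is not how kotlinx-io is implemented; it only shows the `java.nio` technique:

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Hypothetical helper (JVM only): bulk-copies longs into a byte array via a LongBuffer view.
fun longArrayToBytes(data: LongArray, order: ByteOrder = ByteOrder.BIG_ENDIAN): ByteArray {
    // Heap buffer sized for all longs; the view created below inherits this byte order.
    val bytes = ByteBuffer.allocate(data.size * Long.SIZE_BYTES).order(order)
    // The LongBuffer view shares storage with `bytes`, so one bulk put fills the byte array.
    bytes.asLongBuffer().put(data)
    return bytes.array()
}
```

Passing `ByteOrder.LITTLE_ENDIAN` gives the little-endian layout without any per-element code, which is the kind of saving an optimized array API could offer.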