Closed: lemire closed this 11 months ago
@thanhpk @klauspost @omerfirmak @king526 : Please review/comment.
I really don't understand why our code is so convoluted when what I wrote is just... obvious. Is there a catch? Why do we have all this complexity in the first place?
@lemire You are trading space for time: it would allocate a big chunk of memory, crashing some programs if the bitset is big enough.
We already discussed this https://github.com/bits-and-blooms/bitset/pull/103
@lemire you are trading space for time

Run the benchmarks with -benchmem to see the GC hit you are taking. We, at Nethermind, don't depend on bitset anymore (due to https://github.com/bits-and-blooms/bitset/issues/134), but I would rather keep the GC at a minimum.
@thanhpk
it would alloc a big chunk of memory, crashing some programs if the bitset is big enough
I don't think it would. The memory overhead should still be effectively constant:
func readUint64Array(reader io.Reader, data []uint64) error {
	length := len(data)
	bufferSize := 128
	// One fixed-size scratch buffer, reused for every chunk.
	buffer := make([]byte, bufferSize*int(wordBytes))
	for i := 0; i < length; i += bufferSize {
		end := i + bufferSize
		if end > length {
			end = length
			// Shrink the buffer for the final, partial chunk.
			buffer = buffer[:wordBytes*uint(end-i)]
		}
		chunk := data[i:end]
		if _, err := io.ReadFull(reader, buffer); err != nil {
			return err
		}
		for j := range chunk {
			chunk[j] = binaryOrder.Uint64(buffer[8*j:])
		}
	}
	return nil
}
func writeUint64Array(writer io.Writer, data []uint64) error {
	bufferSize := 128
	// One fixed-size scratch buffer, reused for every chunk.
	buffer := make([]byte, bufferSize*int(wordBytes))
	for i := 0; i < len(data); i += bufferSize {
		end := i + bufferSize
		if end > len(data) {
			end = len(data)
			// Shrink the buffer for the final, partial chunk.
			buffer = buffer[:wordBytes*uint(end-i)]
		}
		chunk := data[i:end]
		for j, x := range chunk {
			binaryOrder.PutUint64(buffer[8*j:], x)
		}
		if _, err := writer.Write(buffer); err != nil {
			return err
		}
	}
	return nil
}
@omerfirmak
I would rather keep the GC at minimum.
Yeah. It seems that binary.Read and binary.Write are allocating functions. But we can just reuse the buffer; see my code above.
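To make the allocation claim concrete, here is a rough, self-contained sketch (not from the PR) using testing.AllocsPerRun. The function names are mine and the exact counts depend on the Go version, but binary.Read should report strictly more allocations per call than decoding through a preallocated, reused buffer:

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"testing"
)

var (
	words = make([]uint64, 1024)
	raw   = make([]byte, 8*len(words))
	buf   = make([]byte, 8*len(words)) // reused across calls
)

// allocsViaBinaryRead measures allocations per call of binary.Read,
// which internally allocates scratch space for the payload on each call.
func allocsViaBinaryRead() float64 {
	return testing.AllocsPerRun(100, func() {
		r := bytes.NewReader(raw)
		if err := binary.Read(r, binary.LittleEndian, words); err != nil {
			panic(err)
		}
	})
}

// allocsViaReusedBuffer decodes through the preallocated buffer instead.
func allocsViaReusedBuffer() float64 {
	return testing.AllocsPerRun(100, func() {
		r := bytes.NewReader(raw)
		if _, err := r.Read(buf); err != nil {
			panic(err)
		}
		for i := range words {
			words[i] = binary.LittleEndian.Uint64(buf[8*i:])
		}
	})
}

func main() {
	fmt.Println(allocsViaBinaryRead(), allocsViaReusedBuffer())
}
```

This is the same effect that -benchmem surfaces in the benchmarks: per-call garbage from binary.Read versus none from the reused buffer.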
@lemire I agree
On one benchmark, this PR doubles the performance. With this PR, we reach 1.7 GB/s (reading and writing to memory). Since that covers both reading and writing, the effective bandwidth is 3.4 GB/s, which is not great, but... Note that this is in-memory; if you write to and read from a disk, it is going to be slower.
Before...
After...
Fixes https://github.com/bits-and-blooms/bitset/issues/141