JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.43k stars 5.45k forks source link

BitSet construction could be faster #50326

Open LilithHafner opened 1 year ago

LilithHafner commented 1 year ago

For most source collections, it is faster to iterate once to compute extrema and then a second time to populate the BitSet than our current approach to dynamically resize (and check bounds on each new element).

See #46840 for an abandoned draft.

Qfl3x commented 8 months ago

A bit "off-topic", I'm trying to implement a toy Roaring Bitmap and this is my first time dealing with "weird" datastructures. Just want to ask why use a Vector of UInt64 instead of a BitArray?

LilithHafner commented 8 months ago

A BitArray is itself a wrapper around Vector{UInt64} with a few extra fields.

https://github.com/JuliaLang/julia/blob/713560bb4fe190d59b212021a5d0eda5ffba1010/base/bitarray.jl#L24-L43

I didn't write the original BitSet code, but I imagine the authors chose to wrap Vector{UInt64} directly rather than wrapping another wrapper in order to more easily avoid additional overhead.