contain-rs / bit-vec

A Vec of Bits
Apache License 2.0
169 stars 59 forks source link

Implement `BitArray` type? #112

Open Jujumba opened 3 months ago

Jujumba commented 3 months ago

This may also help to close #9

pczarn commented 3 months ago

Array without the word "dynamic" means fixed length, so I'd call this like ferrilabs: BitStore.

Could be done like this

pub trait BitStore {
    fn get(&self, idx: usize) -> bool;
    fn push(&mut self, elem: bool);
}

pub struct DynamicBitStore<B: BitBlock> {
    ary: Vec<B>,
}

pub struct HybridBitStore<B: BitBlock> {
    Stack([B; 3]),
    Heap(DynamicBitStore<B>),
}

impl<B: BitBlock> BitStore for DynamicBitStore<B>

impl<B: BitBlock> BitStore for HybridBitStore<B>

struct BitVec<S: BitStore> {
    store: S,
    nbits: usize,
}

// BitCursorSlice
pczarn commented 3 months ago

Okay, So I may be wrong, because ferrilabs have stuff like BitArray too, not sure how it works, but I guess certainly useful to do "directly owned fixed size bit-vecs" like BitSlice but owned

Jujumba commented 3 months ago

Array without the word "dynamic" means fixed length, so I'd call this like ferrilabs: BitStore.

Could be done like this

pub trait BitStore {
    fn get(&self, idx: usize) -> bool;
    fn push(&mut self, elem: bool);
}

pub struct DynamicBitStore<B: BitBlock> {
    ary: Vec<B>,
}

pub struct HybridBitStore<B: BitBlock> {
    Stack([B; 3]),
    Heap(DynamicBitStore<B>),
}

impl<B: BitBlock> BitStore for DynamicBitStore<B>

impl<B: BitBlock> BitStore for HybridBitStore<B>

struct BitVec<S: BitStore> {
    store: S,
    nbits: usize,
}

// BitCursorSlice

I think that's useless to have bit storage which may be either stack or heap allocated

The BitArray (limited stack-allocated storage) would be a better option, since:

pczarn commented 3 months ago

We already have a dynamically allocated BitVec

The goal is to have BitVec<S: BitStore = DynamicBitStore<u32>> precisely for BitVec itself to depend on BitStore

pczarn commented 3 months ago

It wasn't obvious from my code, so it could have confused you, but that's my intention

Jujumba commented 3 months ago

We already have a dynamically allocated BitVec

The goal is to have BitVec<S: BitStore = DynamicBitStore<u32>> precisely for BitVec itself to depend on BitStore

Then what's the point of HybridBitStore?

Jujumba commented 3 months ago

Array without the word "dynamic" means fixed length, so I'd call this like ferrilabs: BitStore. Could be done like this

pub trait BitStore {
    fn get(&self, idx: usize) -> bool;
    fn push(&mut self, elem: bool);
}

pub struct DynamicBitStore<B: BitBlock> {
    ary: Vec<B>,
}

pub struct HybridBitStore<B: BitBlock> {
    Stack([B; 3]),
    Heap(DynamicBitStore<B>),
}

impl<B: BitBlock> BitStore for DynamicBitStore<B>

impl<B: BitBlock> BitStore for HybridBitStore<B>

struct BitVec<S: BitStore> {
    store: S,
    nbits: usize,
}

// BitCursorSlice

I think that's useless to have bit storage which may be either stack or heap allocated

The BitArray (limited stack-allocated storage) would be a better option, since:

* We already have a dynamically allocated `BitVec`

* We can transition between stack-allocated `BitArray` to `BitVec`, extending it

Ah, I see

But I do believe that this is misleading: BitVec may allocate data on stack

It would be much simpler (and, in fact, more convenient) to leave BitVec with heap allocations, and provide a separated type (BitArray), with a stack allocated buffer

@pczarn , what do you think?

pczarn commented 3 months ago

@Jujumba there is truth to what you said about simplicity, my new, much simpler idea is to introduce SmallVec and ZeroVec both as completely optional dependencies / features, thus adding very little code.

pczarn commented 3 months ago

@Jujumba The BitStore stuff could be very simple, with just create, store block, load block, get slice, push, extend, and truncate, and perhaps that's all.

pczarn commented 3 months ago

Also, I'd call it BitBlockStore

Jujumba commented 3 months ago

@Jujumba there is truth to what you said about simplicity, my new, much simpler idea is to introduce SmallVec and ZeroVec both as completely optional dependencies / features, thus adding very little code.

And you want to add these structures as a possible buffer to BitVec?

pczarn commented 3 months ago

@Jujumba Yes, that's right

Jujumba commented 3 months ago

@Jujumba Yes, that's right

And how that "hybrid" vector will extend itself if it exceeds capacity?

pczarn commented 3 months ago

@Jujumba the SmallVec takes care of that. First, it puts bits on the stack, and once the capacity of the stack is exceeded, it allocates onto the heap, behaving just like a Vec. It remains on the heap forever unless you call shrink_to_fit when the data can fit on the stack.

In my estimation, we could store 64 bits with SmallVec, or up to 95 bits with our new very custom implementation for storing bits, on the stack.

Edit: on 64-bit machines, it will be 128 bits and up to 173 bits.

pczarn commented 3 months ago

This is just like some Rust libraries that can store up to 23 characters/bytes long strings on the stack within a 24 byte info of a dynamic array. You can make the same thing with bits, and SmallVec provides kinda that.

Jujumba commented 3 months ago

@Jujumba the SmallVec takes care of that. First, it puts bits on the stack, and once the capacity of the stack is exceeded, it allocates onto the heap, behaving just like a Vec. It remains on the heap forever unless you call shrink_to_fit when the data can fit on the stack.

In my estimation, we could store 64 bits with SmallVec, or up to 95 bits with our new very custom implementation for storing bits, on the stack.

Edit: on 64-bit machines, it will be 128 bits and up to 173 bits.

We could still achieve that with BitVec implementing From<BitArray> (it will be just more explicit) I think