uazu / qcell

Statically-checked alternatives to RefCell and RwLock
Apache License 2.0
356 stars 22 forks source link

Safer (checked) `QCellOwnerSeq::new` #50

Open CAD97 opened 3 months ago

CAD97 commented 3 months ago

std's Rc, Arc, and ThreadId have essentially the same pathological edge case as QCellOwnerSeq — incrementing the reference count (owner id sequence) in a loop without ever decreasing it (i.e. by forgetting the cloned Rc/Arc, or always for the monotonic ThreadId and QCellOwnerSeq counts) could overflow the counter and lead to UB. std declares this as pathological unsupported behavior — when would you ever legitimately need even 231 different owning references or threads, let alone 263[^1] — and aborts if this overflow would happen[^2].

[^1]: For some context, with 128 threads all cooperatively incrementing the same shared counter at a rate of 6 GHz, it will still take 139 days at that rate to increment the the counter 263 times. It's effectively never going to happen accidentally.

[^2]: Rc uses cell.set(cell.get() + 1) and aborts when overflow happens. Arc uses atom.fetch_add(1, Relaxed) and aborts at isize::MAX, relying on the fact that having isize::MAX threads cloning Arc concurrently is impossible (at least two bytes of address space are consumed per thread). ThreadId uses a compare_exchange loop, presumably because creating a new thread will never be performance critical like cloning Arc can be.

At least on 64-bit targets (where exhausting the ID space is fundamentally impractical[^3]), it'd be nice to have QCellOwnerSeq::new be made safe. I'd recommend doing a cmpxchg loop (fetch_update) like ThreadId, since creating new owners doesn't seem like it'd ever be a contended operation.

[^3]: Counting to 263 is never going to happen in a reasonable timeframe (see previous footnote), but counting to 232 is entirely practical. OTOH, 32-bit targets with 64-bit atomic support could still use 64-bit owner IDs, zero-extending the address-based owner IDs.

Doing this would entail one of:

[^4]: In both cases QCellOwner could also switch to using checked sequence-based IDs to avoid the alloc requirement, but do note that this would result in the ID space being consumed without reuse by QCellOwner as well as QCellOwnerSeq, which it isn't currently.

Since qcell doesn't expose a way to get the numeric value of a QCellOwnerID (and I think this is a good decision), there's no way to implement this downstream, even partially.

uazu commented 3 months ago

Thanks for the comments, and sorry for the delay in replying.

I guess this must be for a no_std situation where you can't allocate? Otherwise QCellOwner::new is much easier. For no_std everyone is working at a much lower level, and making guarantees with unsafe is more often necessary, so I thought that it would be acceptable.

Also, from the reaction in the forums, it seems like any unsoundness at all, even once every 139 days with a determined effort, is regarded as a demonstrated weakness and therefore a bug. So to be conservative, this is marked as unsafe, even if for all practical non-malicious purposes it is totally sound. Originally I intended the number of bits in the ID to "tend to zero" like in maths once testing was complete, but after the reaction in the forums I realized that would never fly. For practical purposes, the ID could just be a wrapping 8 bits and still provide enough of a check to catch legitimate coding errors. So the unsafe just acts as a flag to the reviewer to make sure that the code isn't making a crazy attempt to wrap the ID and get a collision.

Since the ID goes on every single QCell, it was originally 32-bit everywhere. However when it was proposed to me to use addresses as a basis, I felt that a 64-bit ID would be acceptable on 64-bit because alignment might cause it to take up 64 bits anyway. 32 bits on 32-bit continues to make sense to not bloat all the cells.

I feel we got the balance about right on this one. Personally I prefer not to have long-term panics hanging over me, so using addresses or a wrapping ID means they will never run out.

Considering the option of adding both panicking and non-panicking versions of QCellOwner::new, they would have to be allocated separate ranges, or else use of the non-panicking one could wrap the value meaning that the panicking one wouldn't notice the wrap. So each would get 2^62 (or 2^30) values.

So I think I can do what you're asking, by adding a new panicking call with its own ID range. Does this sound okay?

I do wonder what the situation is where someone would prefer a hidden panic rather than no panic and an explicit "Safety: ..." notice in the code. But really I don't need to understand that. If you want this call it is not a big deal for me to add it.

So if you agree, I have two actions from this: