Use crossbeam-queue for qcell free list

geeklint commented 2 years ago

I'm working on seeing what it would take to add no_std support to qcell, and the use of Mutex when assigning QCellOwner ids is one obvious sticking point. This PR proposes replacing the Lazy<Mutex<SafeQCellOwnerIDSource>> with crossbeam_queue::SegQueue and an AtomicUsize.

This will make a future no_std change easier, but it comes with some caveats so I wanted to propose it separately.

Some possible talking points:

Additional dependency: I targeted only the minimum feature set of crossbeam_queue required to enable this change, but it would still mean qcell has an additional dependency.
QCellOwnerId reuse order: This change causes IDs to be reused in a FIFO order, rather than LIFO. The documentation does not specify what order IDs will be reused in, but this is technically an observable change (and I had to slightly edit one of the test cases because of it).
Memory allocation pattern changes: While this change likely doesn't allocate meaningfully more memory, the way it allocates is subtly changed. Consider an app that new()s a large number of QCellOwners, then drops them all, and then new()s a similar number. With the original version, the first drop will allocate a large buffer and never free it, reusing it for the second wave of drops. The behavior as a consequence of this PR will allocate memory when QCellOwners are dropped, free it when they are reused, and then allocate again when they are dropped a second time. The change also means that the performance of QCellOwner::drop is more consistent, as it will never need to reallocate and copy a potentially large list.
AtomicUsize vs AtomicU32: since IDs are u32s, it might make more sense to use AtomicU32 here, I only didn't because the FAST_ version was using an AtomicUsize and I figured it was for a reason. I can hypothesize that AtomicUsize was used previously because the std docs suggest it is "generally the most portable".

uazu commented 2 years ago

Thanks for looking at this. Yes, it would be good to have no_std support. The original QCellOwner code just did the fast_new, i.e. did not need the mutex. This gave as much protection practically, but could be bypassed to get unsound behaviour by a malicious coder, i.e. was technically a bug. So I added the mutex and list. Really this is not my preferred solution, because it takes memory and doesn't scale as well.

However creating new owners should be relatively infrequent, e.g. you'd expect an owner to look after quite a lot more memory than it occupies itself, so if owners are being destroyed and recreated, then that's only a small part of the associated cost (of all the owned memory which is also destroyed and recreated).

The problem is to get a unique number, and using the address of an allocated object in memory would solve that problem in another way without the free list (or even using the address of the owner itself, as in #14). However this would make the ID a usize.

I'm not too familiar with the no_std restrictions, so I will read up on them later, and also look at the crossbeam_queue and your PR. Yes, I'd prefer to avoid more dependencies if possible. But also I'm not attached to the present solution, and I'd be happy to consider some other way of getting a similar result, so it's worth looking at the bigger picture too. I'll reply again later when I've had some time to look at this.

geeklint commented 2 years ago

From what you said, I think we're on the same page.

Forward thinking, I was also going to propose an address based Cell type (partial to PCell, for pointer). The same cell type could actually support two owner types - one based on Box, and one based on Pin. It would have the downside of being a usize instead of a u32, but it would have a couple upsides: offloading the global state to the allocator implementation (instead of having explicit static globals), and it would work easily on no_std (the owner type based on Box would require alloc, the Pin version wouldn't require anything).

I'll throw some other comments about specific no_std stuff in the no_std issue.

uazu commented 2 years ago

Actually I think using usize for the ID would be fine. The u32 will often occupy a usize anyway in the QCell due to alignment. The reason I chose u32 was so that on 64-bit platforms (where most people code and test) the behaviour of the IDs would be the same as on 32-bit. But if it was using a pointer to allocated memory as the source of the unique ID then that doesn't matter, and it can be a usize.

So one possible path forward is to have QCellOwner contain a Box<usize> or similar to get a unique ID, which wouldn't be a breaking change in the API. As you say that offloads the unique check to the allocator, which reduces complexity. We can discuss PCell etc separately.

geeklint commented 2 years ago

Oh, if changing the size of the QCell id is acceptable, this PR can be completely superceded by that, and there'd be no need for a separate PCell type. I can mock up what that would look like this evening.

uazu commented 2 years ago

Okay, that sounds good. Thanks!

geeklint commented 2 years ago

The only issue would be it mostly precludes the current fast_new behavior. We could keep some of it by using a static array, although if the array is big I think it'll contribute to binary size[1], and the smaller we make it the more likely the "super unlikely" accidental cell ID collision safety issue becomes.

Now, the proposed alternative Pin owner would likely be faster than the existing fast_new, (and safer) but removing the existing fast_new would be a breaking change.

[1] It'll likely never get paged in, so it's probably not a issue for runtime memory consumption, but my experiment just now indicated it does contribute to binary size. But also that if the user never calls fast_new, the compiler will almost certainly optimize it away.

uazu commented 2 years ago

Okay, interesting! I wasn't thinking of trying to preserve fast_new behaviour. However if it seems worthwhile then we can just start it off on an odd number and increase by two each time. Then it will not clash with any values obtained from pointers to usize. (Actually even if it just worked as present, and we risk clashes with values obtained from addresses, that does not break unsafe conventions, because whoever calls an unsafe function has to take part of the responsibility to ensure that invariants are maintained, even if the clash is with some other safe function.)

geeklint commented 2 years ago

right, I mean theoretically fast_new could actually just return the same ID every time - it is an unsafe function after all. But I like the idea of having the allocations point to a type with an alignment greater than 2, and incrementing on odd numbers. The pointed-to doesn't have to be a usize though - I'll probably just make it a u16 or something.

uazu commented 2 years ago

Yes, fine -- so long as it has non-zero size to guarantee that we get a unique address. Depending on the allocator, the allocations get rounded up to whatever smallest unit that allocator can manage, so it probably doesn't make a lot of difference between u16 and usize.

uazu / qcell

Use crossbeam-queue for qcell free list #22