Supporting TCell without std or exclusion-set

khoover commented 7 months ago

I believe qcell can support TCell without either of those features enabled, via refactoring the TCellOwner into an unsafe trait with most of the behaviour plus the current struct implementing this new trait. Then you can introduce a simple macro for creating new owner types that uses a static AtomicBool for uniqueness. Something like

pub unsafe trait CanOwnTCell {
  fn ro<'a, T: ?Sized>(&'a self, tc: &'a TCell<Self, T>) -> &'a T {
    // Copy implementation over
  }

  // All the other cell-related methods
}

unsafe impl<Q> CanOwnTCell for TCellOwner<Q> {}

macro_rules! make_tcellowner_type {
  ($visibility:vis, $owner:ident, $bool_name:ident) => {
    static $bool_name: AtomicBool = AtomicBool::new(false);

    $visibility struct $owner {
      _phantom: PhantomData<()>
    }

    impl $owner {
      pub fn try_new() -> Option<Self> {
        $bool_name.compare_exchange(false, true, Relaxed, Relaxed).ok().map(|_| Self { _phantom: Default::default() })
      }
    }

    unsafe impl CanOwnTCell for $owner {}

    impl Drop for $owner {
      fn drop(&mut self) {
        $bool_name.store(false, Relaxed);
      }
    }
  }
}

This would be a breaking change, in that now TCell has to specify the full TCellOwner<Q> as the owner type, but if you keep the old TCell<Q, T> as a type alias type TCell<Q, T> = TCellTraited<TCellOwner<Q>, T>; I think you can downgrade it to just a major change.

uazu commented 7 months ago

This doesn't stop you creating two owners (two AtomicBool instances) for the same type. If you have two owners then you could get two mutable references to the same memory, which is a safety violation. So I don't think this will work. However let me know if I've misunderstood what you're proposing.

khoover commented 7 months ago

So, two things here: the unsafe trait, and the macro.

The unsafe trait means it's unsafe to implement. The unsafe guarantee you'd want on implementors is that there is no safe way to have multiple instances of the type alive simultaneously during execution, i.e. if someone implements it for a struct that can safely have multiple instances alive, it's their fault for implementing it on the type. It would be fine to safely to construct, drop, then construct again later.

The macro is supposed to create a new static AtomicBool for each new owner type, with the idea being it'll collide if you try and, e.g., invoke it twice with the same struct. It needs a bit of work to be 100% safe; for one, it should probably create an inline module to put the struct in, something like

mod $owner { ... }

$visibility use $owner::$owner;

khoover commented 7 months ago

A motivating example using stakker; in single-stakker, TCell-using mode, instead of having the two checks for the Actor and Share owners, you can construct them simultaneously checked on a single static AtomicBool, where safety is guaranteed by the types only being built in Stakker::new after checking the static var (or by combining new_share_cell_owner and new_actor_cell_owner together).

uazu commented 7 months ago

Yes, there are various details that may need tuning and so on, but first I need to think through the central idea. Unfortunately my brain after Covid has ups and downs so I need to wait for the conditions to be right to think it through properly. Also, I need to do some checks. For example, whether the unsafe within the macro expansion will break when used under #[forbid(unsafe_code)] in the calling crate. It may need #[allow_internal_unsafe] but that appears to be for internal use by std.

uazu commented 7 months ago

Having an unsafe within a macro seems to work okay under #[forbid(unsafe_code)]. I think this could work. Right now we have wait_for_new, so we'd also need a Condvar, but maybe that could remain shared.

I think the whole thing could be reduced down to an additional restriction on the marker type, i.e. Q: Marker where Marker is an unsafe trait using your trick of having a macro to implement it. Marker would have a call to get access to the atomic static variable. The rest could be built on that.

This is because the only thing that we need is one static variable per static type. I looked at this before, but since static variables don't get created one-per-monomorphization, that seemed like a dead end. Actually the macro could just "bless" an existing marker type by adding the Marker implementation and static atomic. I think this could be made clean and tidy enough.

Creating (or blessing) the marker type with a macro is a change in the API, so a breaking change. The question is whether this is worth the trouble. It does remove a lot of internal complexity with the different std/no-std implementations. Speed isn't the issue here because we expect an owner to be long-lived. However a reduction in complexity is good. Also, for no-std, having everything static brings advantages. I will think more about it.

khoover commented 7 months ago

It's a bit yes-and-no, since if you make the marker something the type parameter to TCellOwner needs to implement, then it becomes harder to, e.g., aggregate a bunch of owner markers together into a single check. Plus now the change is forced to be breaking, whereas the other version of adding an unsafe trait that the type parameter in TCell needs to implement can be done without breaking.

But, it's version 0.x, so hey.

uazu commented 7 months ago

I don't want to aggregate multiple owners into a single check. That seems like an unnecessary optimisation. Do you have a use-case where you expect owners to be recreated frequently in pairs?

Okay, I'm finally understanding what you're proposing. So you're saying that things will still work because the new owner creation is independent from TCellOwner? Okay, there is something broken in your approach because anything that implements CanOwnTCell owns all TCells. So if there's more than one owner based on different marker types, those marker types get lost as far as the type system is concerned and two owners can get mutable access to the same TCell, breaking safety. Right now that is not possible. So you need to get Q into the CanOwnTCell trait so that carries through. (Really this all needs implementing and testing to verify that the trait-based TCell performs correctly.)

I wonder whether you see some other advantage of your CanOwnTCell approach? For example whether you see there being various different kinds of owner in the future. In which case, like what?

For the approach of attaching the atomic to the marker type, this is what I've got it down to on the user-side:

unsafe trait Marker {
    fn atomic_ref() -> &'static AtomicBool;
}

macro_rules! impl_marker {
    ($t:ty) => {
        unsafe impl Marker for $t {
            fn atomic_ref() -> &'static AtomicBool {
                static ATOMIC: AtomicBool = AtomicBool::new(false);
                &ATOMIC
            }
        }
    };
}

struct Test;
impl_marker!(Test);

This is a pretty minimal change, but means that the TCellOwner can get hold of the atomic and do its checks using that instead of using the hashset/whatever. As I said, that reduces complexity and is cleaner for no-std.

khoover commented 7 months ago

Okay, there is something broken in your approach because anything that implements CanOwnTCell owns all TCells.

No, it's the same type checking mechanism TCell is currently based on. Look at the trait method, the TCell you're trying to access has to specifically be owned by Self, which is the type of whatever is implementing the trait. The same would be true in the TCell code, it would require an &Q or &mut Q of the specific CanOwnTCell-implementor type.

I do think your version of adding, e.g., a try_new_from_marker to TCellOwner instead probably works out nicer though. Although, you'd need some way of telling whether you used the exclusion set/mutex or this atomic bool from the drop handler.

khoover commented 7 months ago

Thinking a bit more, the atomic_ref function has to be unsafe as well, with the guarantee that only the qcell crate is allowed to call it (or that the qcell crate is the only crate allowed to modify the returned AtomicBool). Otherwise it's a public trait, anyone could call atomic_ref and do whatever they like to the AtomicBool.

uazu commented 7 months ago

Okay, regarding whether or not CanOwnTCell needs a generic <Q> parameter, I think this would just need testing to be sure, so that isn't a blocker.

Yes, it makes sense what you're saying about making atomic_ref unsafe.

I want to keep this crate minimal, or as minimal as possible to cover the required use-cases. So not just adding more and more features if they do the same thing. So ideally if I added this atomic-based code, I'd want to strip out the old code. An atomic connected to the marker could get rid of all the HashSet/exclusion-set code, reducing complexity and increasing speed, but requires a small code change for callers.

However, I realize that using an atomic is not compatible with using a Condvar to support wait_for_new, so that kind of breaks things. (There's an unavoidable race between failing the atomic check and attempting something with Condvar if there isn't a mutex involved to start with.)

How did you anticipate handling the case where the owner cannot be created because there is another owner? (i.e. in a real life application, what would you do?)

It may be that using an atomic has uses as a special case, rather than general-purpose. For the cases where you need the speed, you're probably creating and destroying owners frequently, maybe across threads, in which case you'd want to have a way to handle wait_for_new. I guess the static could be a Mutex<bool>, but that's no longer no-std friendly, so we still need exclusion-set. Actually if we're using a mutex to allow frequent access to the owner from different places, it's equivalent to using a Mutex<TCellOwner> with the current code.

It's a shame this doesn't seem to be working out to be a general-purpose solution. Do you still have a use-case for the atomic approach despite the issues?

khoover commented 7 months ago

However, I realize that using an atomic is not compatible with using a Condvar to support wait_for_new,

Why isn't it? I don't see anything in CondVar that precludes the degenerate case of, e.g., creating a Mutex<()> local to lock right before waiting on a shared condvar? Then you would use wait_while, with the while condition being failing to CAS the bool to true. I.e.

static CONDVAR: std::sync::Condvar = std::sync::Condvar::new();

impl<Q: Marker> TCellOwner<Q> {
  ...
  pub fn wait_for_new() -> Self {
    let dummy_mutex = std::sync::Mutex::new(());
    let guard = dummy_mutex.lock();
    let _ = CONDVAR.wait_while(guard, |_| {
      let flag = unsafe { Q::atomic_ref() };
      flag.compare_exchange(false, true, Relaxed, Relaxed).is_err()
    }).unwrap();
    // At this point, the compare_exchange has returned Ok, so we set
    // the flag and have sole ownership.
    Self { ty: PhantomData }
  }
}

EDIT: actually, that Mutex can probably be a static as well, since we're just going to drop the guard immediately (i.e. it's just there so its guard can be passed to the condvar). Depends on how expensive the new/drop is vs. high-contention locking, and whether high contention is the expected case.

khoover commented 7 months ago

Also, idea on how to make atomic_ref safe; have it return an opaque struct that's a transparent wrapper around &'static AtomicBool. Then only qcell is able to unwrap the struct and see the bool, and we make the safety conditions for the trait be:

The bool cannot be referenced outside of the atomic_ref call
The bool cannot be modified by the atomic_ref call
It must be the same &'static AtomicBool returned by each call.

uazu commented 7 months ago

The race I was imagining goes like this:

Thread 1 sets the atomic bool
Thread 2 tries to set the atomic bool and fails
Thread 1 resets the atomic bool, and signals the CondVar to wake anyone waiting
Thread 2 grabs the mutex and waits on the CondVar, and ends up waiting forever

However if you recheck the atomic bool whilst holding the shared mutex then yes maybe that race is avoided, although I'm not totally sure. But anyway, now things are getting as complicated as the current solution so I don't see the advantage.

I also had a look at the exclusion-set solution, which is no-std. It seems like that implements its own CondVar-like scheme using thread::park and Thread::unpark. But that is also quite complicated.

I'm a fan of reducing complexity, but I don't think this is working out any better.

khoover commented 7 months ago

Maybe instead of a AtomicBool, the marker specifies a RawMutex impl + static instance the TCellOwner should use? wait_for_new() then just becomes Q::mutex_instance().lock(), try_new() is try_lock(), and drop() is unlock(). Then the macro can use something like the raw spinlock example, or you can have different macros ("pick your expected contention").

EDIT: A working example of this in action.

uazu commented 5 months ago

Coming back to this one again. I think we've established that it's viable to change the API to keep a mutex or whatever, one per marker type. It is a breaking change but it looks viable. This removes the need to have exclusion_set or a HashMap or whatever to keep track of types in use. If the advantages and simplifications were significant, then I'd consider making a breaking change for it.

However, I don't see a significant advantage in making the change. It isn't much of a simplification. The code in question is not performance-critical. (It is called once on creation of the owner -- that is all.) In addition the code we have right now is working and tested. So I don't think this change is worth doing right now. However I think if in the future I saw a genuine real-world use-case that absolutely required this, then that might give it a lot more weight and I would reconsider what we've discussed here.

uazu / qcell

Supporting TCell without std or exclusion-set #48