jonhoo / haphazard

Hazard pointers in Rust.
Apache License 2.0
197 stars 26 forks source link

`unique_domain!` macro can be used to generate multiple domains with the same family #54

Open the10thWiz opened 1 year ago

the10thWiz commented 1 year ago

unique_domain! will generate a unique type for every macro invocation, however, if the resulting code is executed multiple times, multiple domains with the same family can be created. This is trivially possible if the code is placed inside a function or loop, since the macro is only expanded once, but the resulting code is executed multiple times.

let mut v = vec![];
for _ in 0..2 {
    v.push(unique_domain!());
}

I don't see an easy solution here, esp. not at compile time.

I don't see any way to force a macro invocation to only be executed once at compile time, so we will need either need to find a way to return the same domain each time (i.e. the example will generate two pointers to the same pointer), or error (at runtime) when the generated code is called a second time.

  1. We could trivially add an internal static AtomicBool to track whether a given macro has been called multiple times. This is not ideal, since it means using unique_domain! inside a function, such as a constructor, is no longer possible.
  2. If we are okay with just always returning the same domain, (i.e. the above example would generate a vector of pointers to the same domain), we can make this work better. This has a few downsides - it forces multiple instances of data structures that use a custom internal domain to share the domain. We should also likely wrap the shared Domain in a custom Arc-like wrapper, which triggers the domain to eagerly reclaim retired pointers when the domain pointer is dropped.

The first option is restrictive, and will cause certain use-cases to require unsafe code. (The current implementation can cause UB in these cases). The second has performance implications, but should be safer. It's trivially possible to make the value returned by unique_domain! a static reference, but the implications of this need to be explored before we settle on an option for this.

jonhoo commented 11 months ago

Ooof, yeah, in a sense what we're trying to do here (get a singleton domain on each invocation) is flat-out impossible. It would require that we generate a different type each time unique_domain! is invoked, but repeated-invocations are a runtime thing, and types are a compile-time thing.

I don't really love any of the options here. If the most common use-case of this is to use in the constructor of a data structure where the constructor uses the resulting type to return a Self<impl Singleton>, then both of these options are kind of bad. Option 1 makes it so that you can't call the constructor more than once. And option 2 makes it so that retiring in any one instance of the data structure need to check the hazard pointers of all instances of the data structure. 2 is better than 1, but 1 is also a real foot-gun for performance.

This all makes me wonder if we should either take option 1 to nudge data-structure authors to require a singleton family to be passed in, or if we should just remove the macro all-together.

What do you think?

the10thWiz commented 11 months ago

I think taking option 1, with a note suggesting data-structure authors should also re-export the unique_domain! macro is the simplest option. Authors who wish to take the second option can use the static_unique_domain! macro I've previously implemented, which has the same performance characteristics as option 2.

We could also explore adding some kind of 'tagged' domain in the future, which provides the same safety as the current Singleton domains, but checks at runtime. At this point, I think we should recommend data-structure authors either use a non-singleton custom domain (and just bite the bullet on using unsafe code), or require the user to provide a custom domain.

the10thWiz commented 11 months ago

After writing an implementation of option 1, and exploring the API a bit, I think it's the right way forward. To that end, I'd like to suggest another improvement, which I anticipate submitting as a separate PR: something like Cow<Domain<F>>, which would provide a convenient way for a data structure to hold either a &'static Domain<F>, or a Domain<F>. This would make passing a domain into a data structure easier to write, and hopefully should look something like:

struct DS<F: 'static> { d: DomainRef<F>, /* ... */ }

impl<F: 'static> DS<F> {
    fn new_in_domain(domain: impl Into<DomainRef<F>>) -> Self {
        Self { d: domain.into(), /* ... */ }
    }
}

fn main() {
    let global = DS::new_in_domain(&Global);
    let unique = DS::new_in_domain(unique_domain!());
    static_unique_domain!(static_domain: Domain<Family>);
    let static_ = DS::new_in_domain(&static_domain);
}
jonhoo commented 11 months ago

Ah, that's interesting. Is there a particular advantage to us shipping that as a type rather than just mentioning in docs somewhere that data structure authors may wish to store Cow<'static, Domain> internally to allow either kind of domain to be used?

the10thWiz commented 11 months ago

Cow<'static, Domain> doesn't work, since it would require Domain to implement Clone. That's the only reason it would be worthwhile to create our own type.

jonhoo commented 10 months ago

Ah, specifically because Cow<B> requires B: ToOwned in its definition (not just in methods). That's awkward. Sure, then I can see the value of a type like that!