pgcentralfoundation / pgrx

Build Postgres Extensions with Rust!
Other
3.66k stars 249 forks source link

Design sketch for potential new {alloc,memcxt,drop,borrow} abstractions #891

Open workingjubilee opened 1 year ago

workingjubilee commented 1 year ago

This issue serves as a public aggregation of my thoughts regarding, and sketchpad for possible new designs of, how we handle the allocations that flow between Rust and Postgres and back again. Many of these ideas are half-formed so I do not intend to respond much to nits here: if things seem weird, unrelated, or inconsistent, it's more likely they have a missing part that I haven't written yet that would fully elaborate them.

Pow<'a, B>

In std::borrow Rust has

pub enum Cow<'a, B>
where
    B: 'a + ToOwned + ?Sized,
{
    Borrowed(&'a B),
    Owned(<B as ToOwned>::Owned),
}

We probably want something like

pub enum Pow<'a, B>
where
    B: 'a + ToRust + ?Sized,
{
    Palloc(&'a B),
    Rust(<B as ToRust>::Owned),
}

In some cases this may allow us to reduce the number of type parameters we use, such as everything that is parameterized by the WhoAllocated<A> abstraction, when matching on the enum isn't going to be a significant overhead.

workingjubilee commented 1 year ago

We should also have abstractions that handle all of the needs_pfree cases we're doing "manually" in Rust here: these all should use a similar abstraction that's more like

pub enum MaybePalloc<'a, P>
where
    P: 'a + ?Sized,
{
    Borrowed(&'a P),
    Palloc(NonNull<P>),
}

Maybe that should be Pow and the other one should be Row, eh? :P

workingjubilee commented 1 year ago

These PRs have illustrated that we probably want to find a way to use a pattern to prevent Drop from interacting poorly with OwnedMemoryContext.

We also may want to find a way to further improve our panic handling with respect to memory contexts:

ccleve commented 1 year ago

I can't comment on the best way to implement things internally, but I can tell you what would make life easier.

#[repr(C)]
struct Foo {
    bar: String,
}

#[pg_guard]
pub extern "C" fn some_pg_func(var: pg_sys::SomePgStruct) -> Foo {
    let x = var.thing;
    let foo = alloc_in_current_mem_context::<Foo>();
    return foo; // no into_pg() required
}

If the foo.bar String got extended, the new memory would automatically be allocated in the same mem context.

I understand that doing it exactly as I've described is probably impossible, but the more we can get away from PgBox, from_pg(), into_pg(), and tracking whether something is in Rust or pg memory, the better.

Postgres manages memory lifetimes really well. Allocating everything in the current transaction memory context and then dropping it all at the end of the transaction covers a host of sins. I see no point in genuflecting before the Rust borrow checker and trying to get lifetimes exactly right. Everything allocated in pg memory has a static lifetime and never gets dropped.

workingjubilee commented 1 year ago

Not all extensions are entirely described by starting, executing a transaction, and then stopping. So speaking of static lifetimes, it is statics, actually, that are are one of my primary concerns. The issue is inevitably going to come down to something like this:

use once_cell::sync::Lazy;

// This allows interior mutability after initialization
// so it can be written by transaction N and read by transaction N+1
static GLOBAL: Lazy<Palloc<SomeType>> = Lazy::new(|| Palloc::move_in(SomeType {}));

struct SomeType {}
struct Palloc<T> {
    // Pretend this is a real type describing a palloc
    inner: T,
}

impl<T> Palloc<T> {
    fn move_in(inner: T) -> Self { Palloc { inner } }
}

"How do we design the API so that anything that gets up here must be in something like TopMemoryContext?"

In other words, 'static need only remain alive for the entire lifetime of the program, but PGX cannot assume that the lifetime of the transaction is also the lifetime of the extension as a program. I do of course want to improve ergonomics here and lean into what Postgres offers rather than away from it, but this is a nontrivial concern. Artificially limiting PGX to only being useful to create a certain "well-behaved" suite of extensions would be... unfortunate.

ccleve commented 1 year ago

You might consider implementing an API that looks like existing Rust arena allocators. Bumpalo is a good one. It's very intuitive. https://github.com/fitzgen/bumpalo