Question: Resource limits

olirice commented 1 year ago

is there any possibility of adding settings for resource consumption similar to plv8's "memory_limit"?

workingjubilee commented 1 year ago

Do you have a particular resource in mind? v8 can do so without impeding program usage because it can stop when it hits that allocation limit and run a GC cycle in order to reclaim memory, tanking time efficiency in order to obtain space efficiency. But a Rust program that runs up against an allocation limit like that is simply going to die. So it's possible, and I can think of several ways of doing so, but is that actually what you want to see happen?

olirice commented 1 year ago

> Rust program that runs up against an allocation limit like that is simply going to die. So it's possible, and I can think of several ways of doing so, but is that actually what you want to see happen?

yes, that's what I was thinking. fail and throw a resource error of some kind

the goals would be to fail fast on under-provisioned hardware and prevent the system from becoming sluggish/unresponsive

workingjubilee commented 1 year ago

If it's a global hardware resource limit then the Rust procedure will simply hit it and die because Rust programs largely do not handle the allocation failure case (because it is not meaningfully something they can handle). When it dies, the Rust procedure will begin unwinding and destroy all its memory usage, and then announce its death to the PL/Rust language handler, which will shrug and carry on, doing something else. When the Rust procedure ends, all its memory should be freed, either way. So I do not see how that is not "failing fast", basically.
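The unwind-and-carry-on behavior described here can be illustrated with a minimal sketch in plain Rust. It is not PL/Rust's actual handler code: `Buffer`, `failing_procedure`, and `run_and_report` are hypothetical names, and the failure is simulated with an ordinary `panic!` rather than a real allocation failure, which may surface differently depending on the allocator and runtime.

```rust
use std::panic;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of `Buffer` values currently alive (illustration only).
static LIVE_BUFFERS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for memory owned by a procedure.
struct Buffer {
    _data: Vec<u8>,
}

impl Buffer {
    fn new(len: usize) -> Self {
        LIVE_BUFFERS.fetch_add(1, Ordering::SeqCst);
        Buffer { _data: vec![0; len] }
    }
}

impl Drop for Buffer {
    // Runs on normal return and also while a panic unwinds the stack.
    fn drop(&mut self) {
        LIVE_BUFFERS.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Hypothetical procedure that fails partway through its work.
fn failing_procedure() {
    let _buf = Buffer::new(1024 * 1024);
    panic!("simulated resource failure");
}

/// Plays the role of the caller: runs the procedure, observes the failure,
/// and reports (did it fail?, how many buffers are still alive).
fn run_and_report() -> (bool, usize) {
    let result = panic::catch_unwind(failing_procedure);
    (result.is_err(), LIVE_BUFFERS.load(Ordering::SeqCst))
}

fn main() {
    let (failed, live) = run_and_report();
    println!("failed: {failed}, live buffers: {live}");
}
```

The caller sees the failure as an `Err`, and the buffer's destructor has already run by the time `catch_unwind` returns, so nothing owned by the failed procedure stays allocated.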

Adding a tighter resource limit is possible but allows much more "thrash" in resource usage, as the system keeps executing PL/Rust procedures over and over that all keep failing. That seems counterproductive? If you are noticing memory leaks across transactions, do let us know, of course, I would want to fix such, as there should be very minimal ongoing resource usage after the initial startup of the PL/Rust language handler. But not zero. The language handler needs a few data structures to keep track of the functions it has compiled, loaded, and executed.

eeeebbbbrrrr commented 1 year ago

There might come a time when we can teach plrust to use Postgres as the Rust global allocator (a GlobalAlloc implementation) -- that would mean each individual plrust function, not necessarily the extension itself. Then we'd be able to have some kind of global "PL/Rust Memory Context". Perhaps per-function, or global to the backend. Just thinking out loud here.

I think we're a little away from that as we need to sort out some lifetime tracking issues within pgrx. The lifetime issues are things we're actively working on now, however.

I dunno if @workingjubilee is thinking of some other kind of "counting" allocator or something that could just panic at some hard limit. I think a lot of this kind of thing is complicated by the fact each plrust function has no knowledge of its surroundings.
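For reference, a "counting" allocator with a hard limit could look roughly like the sketch below. This is only an illustration using the standard library's `GlobalAlloc` trait, not anything PL/Rust or pgrx provides: `CountingAlloc`, `LIMIT_BYTES`, and `try_allocate` are made-up names and the 64 MiB cap is arbitrary. It returns a null pointer at the limit instead of panicking, because `GlobalAlloc` implementations must not unwind; what the caller does with that failure is up to the caller.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical hard cap on live heap bytes (64 MiB, chosen arbitrarily).
const LIMIT_BYTES: usize = 64 * 1024 * 1024;

/// Wraps the system allocator and tracks how many bytes are in use.
struct CountingAlloc {
    in_use: AtomicUsize,
}

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let size = layout.size();
        // Reserve the bytes first, then back out if that exceeds the cap.
        let prev = self.in_use.fetch_add(size, Ordering::Relaxed);
        if prev.saturating_add(size) > LIMIT_BYTES {
            self.in_use.fetch_sub(size, Ordering::Relaxed);
            return std::ptr::null_mut(); // report allocation failure
        }
        let ptr = unsafe { System.alloc(layout) };
        if ptr.is_null() {
            self.in_use.fetch_sub(size, Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        self.in_use.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

#[global_allocator]
static ALLOC: CountingAlloc = CountingAlloc {
    in_use: AtomicUsize::new(0),
};

/// Tries to reserve `bytes` of heap without aborting on failure.
fn try_allocate(bytes: usize) -> bool {
    let mut v: Vec<u8> = Vec::new();
    v.try_reserve_exact(bytes).is_ok()
}

fn main() {
    println!("1 MiB under the cap: {}", try_allocate(1 << 20));
    println!("128 MiB over the cap: {}", try_allocate(128 << 20));
}
```

Note that a `#[global_allocator]` applies to the whole binary it is linked into, which is part of why a limit scoped to one function is harder than it looks.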

olirice commented 1 year ago

> all its memory should be freed, either way. So I do not see how that is not "failing fast", basically.

yep that makes sense. I disabled swap and see the behavior you're describing

but with swap enabled, if memory use spills onto swap during compilation on low-power machines, they (and postgres) become unresponsive. that's the main reason I'm interested in setting a threshold

eeeebbbbrrrr commented 1 year ago

> that's the main reason I'm interested in setting a threshold

Maybe this is more about ulimit or control groups?

workingjubilee commented 1 year ago

Yes, this is a problem that would affect the entire Postgres instance if the concern is about swapping to disk, so solving it at the PL/Rust level does not seem appropriate for that, unless one is okay with a Postgres user's normal SQL queries triggering "slow the system unto unresponsiveness" behavior but not the PL/Rust procedures.

soedirgo commented 1 year ago

@olirice IIUC the failure is on compiling the function, not during the execution of the plrust function itself, is that right? So we'd be applying the resource constraints on cargo invocations, using cgroups or otherwise.

The failure during function execution, where the function panics and unwinds, looks like expected behavior to me and not cause for concern in this case.
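As a rough sketch of constraining a cargo invocation from the outside, the snippet below builds a command that applies `ulimit -v` in a shell before exec'ing `cargo build`. It does not reflect how PL/Rust actually invokes cargo: `limited_cargo_build` is a hypothetical helper, and it assumes an `sh` whose `ulimit` builtin accepts `-v` (bash and dash do). `ulimit -v` caps virtual address space rather than resident memory or swap, so a cgroup v2 limit (`memory.max`, `memory.swap.max`) is likely the closer fit for the swap problem.

```rust
use std::process::Command;

/// Hypothetical helper: builds (but does not run) a command that caps the
/// virtual memory of the shell and everything it execs, then runs cargo.
/// `max_virtual_kib` is passed to `ulimit -v`, which takes KiB.
fn limited_cargo_build(max_virtual_kib: u64, crate_dir: &str) -> Command {
    // `exec` replaces the shell with cargo, so the limit set by `ulimit`
    // is inherited by cargo and by the rustc processes it spawns.
    let script = format!("ulimit -v {max_virtual_kib} && exec cargo build --release");
    let mut cmd = Command::new("sh");
    cmd.arg("-c").arg(script).current_dir(crate_dir);
    cmd
}

fn main() {
    // Roughly 2 GB of address space; the figure is arbitrary.
    let cmd = limited_cargo_build(2_000_000, ".");
    // Print the command instead of running it; call `cmd.status()` to run.
    println!("{cmd:?}");
}
```

If the limit is hit, the compiler fails with an allocation error and cargo exits non-zero, which the caller can report as a compilation failure instead of letting the machine swap.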

tcdi/plrust#368