ClusterLabs / OCF-spec

http://standards.clusterlabs.org

RFC: OCF profiles #23

Closed jnpkrn closed 3 years ago

jnpkrn commented 5 years ago

See the commit messages + commits themselves.

Could this be a good way to close the gap between theory (OCF) and practice (pacemaker) without hurting anyone else who may be using the standard?

And a way to add sets of functionality gradually, without squeezing a multi-dimensional feature space into a single version number?

Isn't this akin to how, e.g., the Java ecosystem works (atomic units of APIs to be supported in full or not at all)?

jnpkrn commented 5 years ago

If/when this framework gets adoption, we can talk about other profiles.

I have one very specific profile in mind, stackable-1, that would help along the lines I mentioned when the problematic configuration pattern was recently discovered:

https://github.com/ClusterLabs/resource-agents/issues/1304#issuecomment-473525495

and which could, for instance, bring back to pacemaker what was good about rgmanager (more controlled and less error-prone combinability of particular resources within hierarchical arrangements).

jnpkrn commented 5 years ago

Some quite interesting conclusions might also follow.

For one, an agent supporting whichever OCF version introduces profiles with some normative high-level semantics (yet to be added) should be actively prevented from being configured as a clone unless it supports a clonable-X-like profile.

Observe that nothing prevents this today, yet it is bound to cause suffering in some cases. Often the actual role of the RM is not only to guarantee a living instance within the cluster (aligned with HA), but also at most a single living instance within the cluster (a.k.a. mutual exclusion), and some agents have so far silently required the latter. Take, for example, ocf:heartbeat:IPaddr2: mentally drop its active support for pacemaker's cloned style of running the agent (support that could be rather passive if the RM handled it per these optional profiles), and you get something that cannot be cloned/node-parallelized by definition (the default under the respective OCF standard introducing profiles). Conversely, only agents of a given OCF version that explicitly declare support for a clonable-X-like profile would be allowed in configurations to that effect. That holds only if the particular RM supports the multi-node notion at all, which is not a strict requirement: simplistic local/non-distributed runners of OCF agents cannot be excluded, and these would evidently not support a clonable-X-like profile.
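To make the gating idea concrete, here is a minimal sketch of how an RM might refuse a clone configuration for an agent that never opted in. Everything in it is an assumption for illustration: `AgentMetadata`, `supports_profile_family`, `validate_clone` and the `ocf:example:*` agent names are hypothetical, and no real OCF meta-data schema or pacemaker API is implied.

```python
from __future__ import annotations

from dataclasses import dataclass


class ConfigurationError(Exception):
    """Raised when a configuration asks for something the agent never declared."""


@dataclass(frozen=True)
class AgentMetadata:
    # Hypothetical in-memory view of an agent's meta-data; not a real OCF schema.
    name: str
    profiles: frozenset[str] = frozenset()


def supports_profile_family(agent: AgentMetadata, family: str) -> bool:
    """Return True if the agent declares any version of `family` (e.g. clonable-0)."""
    for profile in agent.profiles:
        base, sep, version = profile.rpartition("-")
        if sep and base == family and version.isdigit():
            return True
    return False


def validate_clone(agent: AgentMetadata) -> None:
    """Reject a clone configuration unless the agent opted in explicitly."""
    if not supports_profile_family(agent, "clonable"):
        raise ConfigurationError(
            f"{agent.name} does not declare a clonable-X profile; refusing to clone it"
        )


# Hypothetical agents: one that relies on mutual exclusion, one that opted in.
exclusive = AgentMetadata("ocf:example:exclusive-ip")
parallel = AgentMetadata("ocf:example:parallel-service", frozenset({"clonable-0"}))

validate_clone(parallel)  # passes silently
try:
    validate_clone(exclusive)
except ConfigurationError as err:
    print(err)
```

The default here is deliberately "not clonable": an agent that declares nothing is treated as requiring mutual exclusion, which matches the default described above.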

jnpkrn commented 5 years ago

The other conclusion could be that unique can be kept to annotate parameters. It was a spot-on name, and it would be sad to ditch it just because pacemaker is the elephant in the room (pacemaker could learn to treat 1.1-conformant agents the right, intended way). For reloadable stuff, another profile, reloadable-0, would be devised and used by the agents that implement online reconfiguration. That's a counter-proposal to this very part of #21.


I also realized that I wasn't entirely correct so far: repurposable-0 is not a specialization of clonable-0, but rather a fully orthogonal profile. Agents currently in accordance with pacemaker's use of promotable clones would require both the clonable-0 and promotable-0 profiles at once (promotable-0 being a specialization of repurposable-0 with only two clearly distinguished roles).

Another thing: it may be quite common for a clonable-0 resource agent to support at most a single instance per node. If there is a whole class of such resources (e.g. controld amongst them), it might make sense to devise an actual clonable-0 specialization, clonable-singleton-0, which would take some configurables away from the profile by making them constant (the maximum number of instances per node, otherwise governed by the agent's profile configuration under plain clonable-0). At this point I am still referring to non-existent specifications that currently live only in my head, but there would be a mechanism to provide additional profile-specific variables along with expressing support for the profile in the meta-data.
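The relations between these profiles can be sketched roughly as follows. This only encodes what the comment above says (promotable-0 specializes repurposable-0, clonable-singleton-0 specializes clonable-0, and a specialization may pin a profile variable to a constant). The table layout, the function names and the variable name `max-per-node` are invented for illustration, since no such specification exists yet.

```python
from __future__ import annotations

# Hypothetical "specializes" relation between profiles, taken from the prose above.
SPECIALIZES = {
    "promotable-0": "repurposable-0",
    "clonable-singleton-0": "clonable-0",
}

# Variables a specialization pins to a constant; "max-per-node" is an invented name.
FIXED_VARIABLES = {
    "clonable-singleton-0": {"max-per-node": 1},
}


def implied_profiles(declared: set[str]) -> set[str]:
    """Expand declared profiles with every profile they specialize."""
    result: set[str] = set()
    for profile in declared:
        current: str | None = profile
        while current is not None and current not in result:
            result.add(current)
            current = SPECIALIZES.get(current)
    return result


def usable_as_promotable_clone(declared: set[str]) -> bool:
    """clonable-0 and promotable-0 are orthogonal, so both are required at once."""
    return {"clonable-0", "promotable-0"} <= implied_profiles(declared)


def effective_variables(profile: str, declared_vars: dict[str, int]) -> dict[str, int]:
    """Merge agent-declared variables with whatever the profile fixes."""
    merged = dict(declared_vars)
    merged.update(FIXED_VARIABLES.get(profile, {}))
    return merged


print(usable_as_promotable_clone({"promotable-0"}))
print(usable_as_promotable_clone({"clonable-singleton-0", "promotable-0"}))
print(effective_variables("clonable-singleton-0", {"max-per-node": 4}))
```

Declaring promotable-0 alone is not enough in this sketch, because it implies repurposable-0 but says nothing about clonable-0.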


Overall, I think we could restore the soundness of OCF (dubious these days because of various isolated, proprietary extensions) as a materialized agreement between the providers (agents) and the consumers (resource managers).

jnpkrn commented 5 years ago

Note that there appear to be so many synergies available with this modular approach that it would be very ill-advised not to go that route. It may be expensive as a one-off cost to kick this off, but reality shows that a strict monolith is prohibitively expensive all the time, even when the changes to be made are really opt-in, self-contained extensions.

For instance, the Linux kernel finally received something reassuring regarding the PID reuse problem:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3eb39f47934f9d5a3027fe00d906a45fe3a15fad

It was brought up multiple times on the lists over the past few years, e.g.:

https://oss.clusterlabs.org/pipermail/developers/2017-July/001098.html
https://lists.clusterlabs.org/pipermail/users/2017-April/021957.html

Now, we can finally do something reasonable with the said provision in the kernel!

Possibly like this:

Now, IIUIC, there's a lot of related work so we can benefit even more from the pidfd abstraction in Linux, e.g., polling:

https://lwn.net/ml/linux-kernel/20190425190010.46489-1-joel@joelfernandes.org/

but we also need a modular, extensible way to formalize support in the standard for such mostly very optional extensions (optional here because this one is strictly limited to a single system), with the amount of administration proportional to how self-contained these "modules" are. A full-blown approach in which every decision has the same scope doesn't really fit an agile tone. Observe how long we were practically unable to move forward: two decades?
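As an illustration of the pidfd provision mentioned above (not of any concrete proposal in this thread), here is a minimal sketch of signalling and waiting on a process without the PID reuse race. It assumes Python 3.9+ on Linux 5.3+, where `os.pidfd_open` and `signal.pidfd_send_signal` are available; the helper names `pidfd_usable` and `terminate_and_wait` are made up for this example.

```python
import os
import select
import signal
import subprocess
import sys


def pidfd_usable() -> bool:
    """Probe whether this Python and kernel expose pidfd (Python 3.9+, Linux 5.3+)."""
    if not (hasattr(os, "pidfd_open") and hasattr(signal, "pidfd_send_signal")):
        return False
    try:
        os.close(os.pidfd_open(os.getpid()))
    except OSError:
        return False
    return True


def terminate_and_wait(pid: int, timeout_s: float = 5.0) -> bool:
    """Send SIGTERM through a pidfd and wait for exit; True if the process is gone."""
    try:
        # The pidfd keeps referring to this very process, even if the PID is reused.
        pidfd = os.pidfd_open(pid)
    except ProcessLookupError:
        return True  # nothing with that PID exists any more
    try:
        try:
            signal.pidfd_send_signal(pidfd, signal.SIGTERM)
        except ProcessLookupError:
            return True  # exited and was reaped after we opened the pidfd
        poller = select.poll()
        poller.register(pidfd, select.POLLIN)  # readable once the process exits
        return bool(poller.poll(int(timeout_s * 1000)))
    finally:
        os.close(pidfd)


if pidfd_usable():
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    print("stopped:", terminate_and_wait(child.pid))
    child.wait()  # reap the zombie
```

The race is only fully closed when the pidfd is obtained while the PID is known to be valid, e.g. by the parent before it reaps the child, as in the usage above. Being tied to a single Linux system, this is exactly the kind of optional capability a profile could advertise.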

A core + modules framework will hopefully make it easy to address all those long-term deficiencies in an effective way.

Also, think about how the high-level management tools could gradually adapt to new profiles: all of a sudden they would refer to "Add support for clonable-0 profile" instead of "Add support for pacemaker's \<clone>". The shift is also that whenever any other resource manager supports that profile, the tools could fairly easily reuse the abstract aspects of their profile support for that resource manager as well.

EDIT: renamed pacemaker-fdstored to pacemaker-dsctored, so that we have a-f range fully covered :-)

kgaillot commented 3 years ago

This is an interesting idea that may be useful one day, but I would rather wait until we have a strong need for such modularity before deciding the details, to ensure it's suitable to what becomes needed.