Closed majewsky closed 4 days ago
The quota set retrieves its information from the resource table, so the Ceph endpoint should have some awareness to differentiate between AZs. Is that the case? Otherwise I agree on the base quota assignment problem that you stated.
I don't think I understand the question. Let's follow up on this on the phone.
On your first point you mentioned a capacity report (I presume the request to the liquid and therefore to the service)
Reports are from the liquid to Limes; the opposite direction is a request.
Then just to clarify: what needs to be done here is simply to check during the scrape whether the resource is truly AZ-aware, right?
Yes, the intended change is that if the resource is AZ-aware, Limes ought to ignore the report for AZ `any`, but not as a silent error.
I'm not quite sure what you mean by the DB field mentioned in point 3, `project_az_resources.backend_quota`. Why would this approach be necessary? Can you elaborate a little bit further?
Right now, liquids only report quota per project, not per project and AZ. This information is persisted in `project_resources.backend_quota`, and used to decide when to force a sync of quota from Limes to the liquid. If AZ-aware quota is introduced, the same data needs to be retained on the AZ level.
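To illustrate why the backend quota has to be retained per AZ, here is a minimal sketch of such a sync decision. Everything in it is an assumption made for illustration: the function name `NeedsQuotaSync`, the map-based inputs, and the use of a nil pointer for "the liquid has not reported a quota for this AZ yet". It is not how Limes actually implements this.

```go
package main

import "fmt"

// NeedsQuotaSync is a hypothetical helper (not part of Limes). It reports
// whether the quota that Limes wants to have set differs, in any AZ, from the
// quota that the liquid last reported. `desired` maps AZ name to the quota
// Limes wants; `backend` maps AZ name to the last reported quota, with a
// missing or nil entry standing for "nothing reported yet".
func NeedsQuotaSync(desired map[string]uint64, backend map[string]*uint64) bool {
	for az, want := range desired {
		got, ok := backend[az]
		if !ok || got == nil || *got != want {
			return true
		}
	}
	return false
}

func main() {
	reported := uint64(10)
	backend := map[string]*uint64{"az-one": &reported}

	// Same quota in the only AZ: no sync is required.
	fmt.Println(NeedsQuotaSync(map[string]uint64{"az-one": 10}, backend))

	// A second AZ without a reported backend quota: a sync is required.
	fmt.Println(NeedsQuotaSync(map[string]uint64{"az-one": 10, "az-two": 5}, backend))
}
```

With only a per-project value, a mismatch in a single AZ could not be detected, which is the reason for keeping the value on the AZ level.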
This is becoming a requirement for liquid-ceph: We will have several storage classes grouped into a single resource as the resource's AZ slots. Using placeholder names for illustrative purposes, there might be several storage classes like:
I recommended against modelling those as separate AZ-unaware resources, so these will be grouped into a single resource "3 replicas in the same AZ". The main consequence from this is that LIQUID needs to add support for AZ-aware quotas (instead of just AZ-aware capacity and usage).
So an additional configuration is needed in `type ResourceInfo`. To avoid turning ResourceInfo into a sea of interconnected booleans, I'm introducing a new enum `type ResourceTopology`. The existing two behaviors are described by `FlatResourceTopology` (no AZ-awareness, everything is in AZ `any`) and `AZAwareResourceTopology` (AZ-awareness for usage and capacity, but not quota), and the new behavior is `AZSeparatedResourceTopology` (AZ-awareness for usage and capacity and also quota). As part of this, the previously implicit differentiation between flat and AZ-aware topology becomes explicit now, in order for Limes to become able to act as a kind of linter for liquid behavior.

To give a preview of the implementation scope for Limes, this means that:
- [...] `AZAwareResourceTopology`, a capacity report for AZ `any` shall be rejected because only known AZs and `unknown` are allowed for this topology. This is an easy change that is localized to the LIQUID plugin bridge. (In the future, we can think about using the topology to optimize algorithms inside Limes, but that probably won't be in scope for the initial work package. For example, the `commitment_is_az_aware` config flag can be replaced by a check for the selected topology.)
- [...] `AZSeparatedResourceTopology`. This is another easy change: We already have the AZ quotas in our DB, we just need to put them in the request.
- [...] `AZSeparatedResourceTopology`. This is a slightly larger change that involves adding `project_az_resources.backend_quota` to the DB schema, but not too bad, either.
- [...] `any`. But this does not make sense for the storage class scenario described above: Which distinct storage class would you give the quota for storage class `any`? We cannot divide it between storage classes because `SetQuota` only sees the numbers for one specific project and does not have any information about capacity distribution. My best idea right now is to treat the configured base quota as applying for each AZ separately for `AZSeparatedResourceTopology`, but I'll let this problem simmer in my head a bit longer before deciding on a solution. What's clear is that we need some solution because base quota is going to be desirable for Ceph resources eventually.
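The "linter" role described above could look roughly like the following sketch. The three topology names are taken from this issue; the string values behind them, the helper `CheckReportedAZs`, and its signature are assumptions made for illustration. Treating `AZSeparatedResourceTopology` with the same AZ rule as `AZAwareResourceTopology` is also an assumption, since the text above only spells out the rule for the latter.

```go
package main

import (
	"errors"
	"fmt"
)

// ResourceTopology mirrors the enum proposed in this issue. The string values
// are placeholders for this sketch, not a confirmed wire format.
type ResourceTopology string

const (
	FlatResourceTopology        ResourceTopology = "flat"
	AZAwareResourceTopology     ResourceTopology = "az-aware"
	AZSeparatedResourceTopology ResourceTopology = "az-separated"
)

// Pseudo-AZ names as used in the issue text.
const (
	azAny     = "any"
	azUnknown = "unknown"
)

// CheckReportedAZs is a hypothetical helper. It returns an error naming every
// reported AZ that the declared topology does not allow, so that a misbehaving
// liquid is flagged loudly instead of being silently ignored.
func CheckReportedAZs(topology ResourceTopology, knownAZs, reportedAZs []string) error {
	known := make(map[string]bool, len(knownAZs))
	for _, az := range knownAZs {
		known[az] = true
	}

	var errs []error
	for _, az := range reportedAZs {
		switch topology {
		case FlatResourceTopology:
			// No AZ-awareness: everything must be reported in AZ "any".
			if az != azAny {
				errs = append(errs, fmt.Errorf("AZ %q is not allowed for topology %q", az, topology))
			}
		case AZAwareResourceTopology, AZSeparatedResourceTopology:
			// Only known AZs and "unknown" are allowed. Applying this rule
			// to the AZ-separated topology as well is an assumption.
			if az != azUnknown && !known[az] {
				errs = append(errs, fmt.Errorf("AZ %q is not allowed for topology %q", az, topology))
			}
		default:
			return fmt.Errorf("unsupported topology %q", topology)
		}
	}
	// errors.Join returns nil when no violations were collected.
	return errors.Join(errs...)
}

func main() {
	known := []string{"az-one", "az-two"}
	fmt.Println(CheckReportedAZs(AZAwareResourceTopology, known, []string{"az-one", "unknown"}))
	fmt.Println(CheckReportedAZs(AZAwareResourceTopology, known, []string{"any"}))
}
```

Using a single enum value in the `switch` keeps each rule in one place, which is the advantage over several interdependent booleans on `ResourceInfo`.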