Open Josh-Diamond opened 1 year ago
@Josh-Diamond What extra information is provided by displaying the non-formatted text suggested?
@kwwii the non-formated text suggestion came from AKS Operator logs, and instead of displaying, This resource is currently in a transitioning state, but there isn't a detailed message available
, my request is to replace this message w/ the more specific log coming from AKS Operator, to read Waiting for cluster [c-REDACTED] to update node pool [REDACTED]
.
w.r.t. What extra information is provided by displaying the non-formatted text suggested?
The specific node pool being updated. If we have access to more specific details, we should pass them along to user w/o them having to navigate off Cluster Details page. From QA perspective this is a bug.
Thanks, I agree that we should give the user as much information as possible
This should be transferred to backend to provide the information via the operator.
I will look on this
It is needs to be discussed with UI team,
@Josh-Diamond is there a respective JIRA issue ?
@gaktive should I create a dashboard issue or will you ?
@kkaempf You have the instant context so I'll allow you to file :)
@kkaempf @gaktive I don't think this is one for the UI team (https://github.com/rancher/dashboard/issues/8877#issuecomment-1547496221). Whoever owns the provisioning.cattle.io.cluster
metadata.state.message
should take this on
Hi @richard-cox. @mjura and I investigated how to implement a proper messaging strategy during update and we think using the aksclusterconfigs.aks.cattle.io
object would be the most sensible solution. We are already using this resource to write error messages and it is equivalent to what we need to implement for non-error messages. Is there a reason for using provisioning.cattle.io.cluster
instead? This would add complexity to the operator messaging, would mean using different strategies for different types of messages and adds new dependencies to the code.
Would it be possible to consider moving to aksclusterconfigs.aks.cattle.io
?
The context of the cluster is the prov cluster resource itself. So when viewing a prov cluster or, importantly a list of prov clusters it's status comes from itself rather than a secondary resource.
We've hit a lot of issues in the past with secondary resources, specifically how they affect scaling. They will also break the new pagination process coming in soon.
Hi @richard-cox, error messages are currently being updated using this secondary resource aksclusterconfigs.aks.cattle.io
(same for the other hosted providers: EKS and GKE). Using this for information messages would align with what's being used in the operator code and would make it easily extensible to other hosted providers. Are there other controllers that are using provisioning.cattle.io.cluster
for updating messages?
I'm not sure about controllers, however all clusters that are shown in the UI look at the provisioning cluster for state and state message.
Putting status/ui-blocked
on this since this ties into https://github.com/rancher/dashboard/issues/8877 and the label is in use on rancher/rancher
for tracking UI issues.
Despite this comment UI team believes that this must be fixed in aks-operator.
After discussing what's the best way to tackle this with @mjura, we consider this should wait to have the new Rancher API available to be implemented. As we don't know when it will be available yet, we think it is best to move this back to Q2, considering it is not a high priority issue.
When you create a hosted cluster (e.g. AKS), although the cluster.provisioning.cattle.io
does get created, the cluster.management.cattle.io
is the tertiary object, and appears to be what the UI displays statuses from. Based on my testing, when creating on of these clusters, the Updated
condition is not present on the provisioning cluster object, which leads me to believe the UI is pulling from the management cluster object; this is also reinforced by the fact that the management cluster object is one manipulated during cluster creation/update. Looking in the API, I see the following for the management cluster object:
"state": {
"error": false,
"message": "",
"name": "provisioning",
"transitioning": true
},
and the following for the provisioning cluster object:
state": {
"error": false,
"message": "Waiting for API to be available",
"name": "waiting",
"transitioning": true
},
With the following showing on the UI:
I think in this case, it seems like the correct path for the hosted team would be to set the corresponding condition on the management cluster, be it Provisioned
or Updated
depending on whether the cluster has been Provisioned
successfully yet, since the message of the condition is ultimately what drives the metadata.state.message
from the v1 API. @richard-cox feel free to correct me if any of the above information is incorrect. I may be a bit fuzzy regarding how the UI pulls metadata.state.message
when relationships are involved in the v1 API.
Good spot @jakefhyde. I've investigated some more, interestingly it's different given the type of cluster
For tracking
_stateObj
contains the logic to decide between prov or mgmt cluster state objectstateDescription
is the text shown in cluster listsbanner
is the text shown in the cluster detail page's bannercc @nwmac
@kkaempf since this isn't going to be in 2.9.1, should we change the milestone to the next minor version 2.10.0 and then assess? @mjura is chiming in that the CRD used to source messages needs to be changed.
Since @mjura is already on it, I moved it to 2.9-Next1
Michal, feel free to move to a different milestone as you see fit 😉
@richard-cox The fact we use mgmt cluster is most likely historical - I thought we would prefer to move towards using the provisioning cluster where possible and ultimately remove the need for the UI to request the mgmt cluster. Would it not be better for the hosted operators to update the provisioning cluster and for the UI to be updated to look there?
Since @mjura is already on it, I moved it to 2.9-Next1
We talked with Gary and I moved it to v2.10, due it will be needed UI help, it this issue has low priority
@richard-cox @mjura & @jakefhyde can you sync on this and determine the next step to take?
@nwmac to follow up with @mattfarina on the nomenclature used for what a cluster here is in this context, since that would drive some of the UI around here.
SURE-7333
Setup
2.7.0
Chrome
Describe the bug
Cluster Management > Cluster details
page does not display accurate message.To Reproduce
2.7.0
Cluster Management > Cluster Details
pageResult
This resource is currently in a transitioning state, but there isn't a detailed message available.
Expected Result
time="2023-03-07T02:10:50Z" level=info msg="Waiting for cluster [REDACTED] to update node pool [REDACTED]"
Screenshots