Open fbalak opened 7 years ago
@fbalak It would be really helpful if you can provide the API logs when it was in error state.
@anivargi Sure, here is /var/log/messages
from server where the api is run:
messages-server.txt
If you want access to the machine, I can grant it to you if you send me ssh public key.
@fbalak can you share /var/log/tendrl/api/error.log
@r0h4n @shtripat looks like TendrlContext
is temporarily empty even if the cluster_id
is present, which causes the API to fail, I can add a check for that, but does this need to be handled by the backend?
Assuming you have not imported this cluster, tendrl-node-agent will start syncing all the data after the default "sync_interval" (as given in node-agent.conf.yaml). This is expected behavior and the UI and API should respond appropriately before the first sync completes
If the cluster is a Tendrl managed cluster, then we need to investigate further.
@fbalak
@anivargi @r0h4n I have been testing this and from the UI point of view it looks good now:
I am just concerned that the API call still returns 500 Internal Server Error
and it provide no information for user about what is happening. If we know where the problem is in the backend can you please update error message and exit code? I would suggest 503 Service Unavailable
.
After discussion with @dahorak we came to conclusion that there should not be any 5xx error. It would be best to return not entirely imported cluster in response and add some loading API attribute that would be reflected in UI but It would also make sense to return code 202 Accepted and return empty list.
Based on discussion with @r0h4n today, I would propose the following.
Since Tendrl needs to wait for all node agents to send all data required to describe the cluster, the cluster can't be shown immediately. Only when Tendrl has all the data, it makes sense to return the cluster in the api/1.0/clusters
api call.
Unfortunately there are few problems with the current behavior:
I don't think it makes sense to return not fully detected cluster in clusters list for multiple reasons:
So one possible approach would be to return empty list in the clusters call response until the cluster is fully detected. And then to let the user know about cluster detecting taking place, tendrl could send the notification about this.
I searched in the design documents and the ui design doesn't anticipate with any delay in detection of the cluster for import: https://redhat.invisionapp.com/share/8QCOEVEY9#/screens/244820318
@julienlim How would you suggest to address this delay in cluster detection? Would the notification as suggester be ok?
@fbalak @dahorak I don't think that returning 202 Accepted makes sense, as the cluster detection is not triggered by clusters
api call.
@fbalak I disagree with your description in https://github.com/Tendrl/api/issues/306#issuecomment-336105767 - showing spinner during cluster detection looks much more reasonable compared current 500 status code which is ignored by the ui. Unfortunately it seems that the 500 status code was misused to indicate that the cluster detection is running, which is not ok.
@mbukatov I agree with you, I suggested @fbalak 200 OK with empty clusters list (if it is not possible to mark cluster as not fully detected).
Current takeaway on answers we need to the following questions:
@Tendrl/tendrl-qe @rghatvis@redhat.com @mcarrano @julienlim
Thinking about my previous suggestion to use notification: sending notification that tendrl is detecting a cluster makes sense from backend perspective, but could we use this notification to control when to show spinner in the ui?
@Tendrl/tendrl-core do you guys think we should add this right away or wait for the next release?
More updates on this issue discussed in today's daily meeting
@Tendrl/tendrl-core @Tendrl/tendrl-qe @mcarrano @r0h4n
Here's our recommendation from a UI perspective:
(1) When there is only 1 cluster that is being detected (but not yet ready for importing), show the following blank slate:
Title: Cluster detected Summary: Tender server is connecting to the Tendrl node agents to detect the cluster resources.
(2) When there are multiple clusters and a detected cluster shows up in the Cluster list (but is not yet ready for Import), in the column before the button, show the following message “Cluster detection in progress” and disable Import button. When cluster detection has completed, then enable the “Import” button. Currently, the status of the cluster prior to being imported shows a “?” which is probably still okay to leave as-is.
If there's any other scenarios that I've not addressed, please advise and we can come up with other recommendations accordingly.
@julienlim This issue is pretty old. Is it still persisting for you?
I have cluster with 4 nodes. I have installed tendrl and after all was installed I had to wait about 2 minutes before I was able to use tendrl ui because List Clusters API call was returning
500 Internal Server Error
. Most of other API calls were working correctly. (I was able to log in and navigate through sites)After 2 minutes the site and API call started working correctly.
This is how it looks right after installation:
Tested with: