Open shtripat opened 6 years ago
@r0h4n @nthomas-redhat @julienlim @mbukatov plz check and ack. @mbukatov I have left placeholders for adding tendrl-ansible details. Plz check and comment accordingly so that the same can be added.
Please separate the detection and expand spec
@shtripat @mbukatov @r0h4n @nthomas-redhat
In reviewing https://github.com/Tendrl/specifications/pull/254/commits/50fca318d9620c4922047dbcf0a22fc41b54e308, here are some things to consider:
(1) what happens if tendrl-ansible fails to get the tendrl agent(s) -- i.e. node, collectd, gluster-integration -- properly installed and configured? Can we add some mention about how this is tackled, even if it's a small mention that this is all handled at the command line. [Basically a spec should account for error-handling to some degree.]
(2) can one assume all this gets logged somewhere in some tendrl log file(s) -- whether successful or failed so one can troubleshoot?
(3) The "Testing" section in the document (which I assume refers to the "Acceptance Criteria") should include some mention that the cluster's host list in the Tendrl UI, as well as Grafana and its related dashboard(s), will reflect the additional nodes/hosts.
(4) Can we also mention that this detects expansion of 1 or more nodes and is not limited to only 1 at a time? The testing (or Acceptance Criteria) will need to allow for n nodes getting added to the Gluster trusted storage pool, i.e. it should work for 1 node as well as for multiple nodes.
(5) Acceptance Criteria should also mention whether the Inventory file is the one updated or not (as I couldn't tell if we were going with the Inventory file or with something in-memory, the alternate option), but that could be because the implementation is not yet determined.
There were some minor typos but that can be easily fixed when the spec gets updated.
@r0h4n regarding "Please separate the detection and expand spec", I would start a separate specification for detection of new nodes in the gluster cluster.
@julienlim
(1) what happens if tendrl-ansible fails to get the tendrl agent(s) -- i.e. node, collectd, gluster integration -- properly installed and configured? Can we add some mention about how this is tackled, even if it's a small mention that this is all handled at the command line. [Basically a spec should account for error-handling to some degree.]
If tendrl-node-agent is not found, ideally tendrl-ansible, being a command line tool, should report a detailed error. I would suggest a separate specification under tendrl-ansible for this. @r0h4n ?? collectd is installed as a dependency of tendrl-node-agent. tendrl-gluster-integration is installed as part of the import cluster flow, so when import cluster gets executed for the new node in the cluster, it would be installed. Any errors around installation of tendrl-gluster-integration would be logged as part of the flow. I can add details like this in the specification.
(2) can one assume all this gets logged somewhere in some tendrl log file(s) -- whether successful or failed so one can troubleshoot?
Anything related to tendrl-ansible should ideally be taken care of as part of a separate specification under tendrl-ansible. The details of the import cluster flow would be updated in this specification.
(3) The "Testing" section in the document I assume refers to the "Acceptance Criteria" should include some mention that the cluster's host list in the Tendrl UI as well as Grafana will reflect the additional nodes/hosts as well as related dashboard(s).
Ack. I will add this acceptance criterion.
(4) Can we also mention if that this can be detection for expansion of 1 or more nodes and that is not limited to only 1 at a time, as the testing (or Acceptance Criteria) will need to allow for doing n number of nodes that get added to the Gluster trusted storage pool, i.e. work for 1 as well as work for multiple nodes.
Ack. I will add this acceptance criterion.
(5) Acceptance Criteria should also mention if the Inventory file is the one updated or not (as I couldn't tell if we were going with the Inventory file or with something in-memory, the alternate option), but that could be because the implementation is not yet determined.
Ack. At the moment we would go ahead with the inventory file and manual execution of tendrl-ansible from the command line. So the admin first modifies the tendrl-ansible inventory file with entries for the additional nodes and then runs it. This installs tendrl-node-agent on the new nodes and configures them to report to the existing tendrl server. Once this step completes, automatic detection of the nodes on the tendrl server, and then running the import cluster flow for these additional nodes, is taken care of by the tendrl server's node-agent service. I will add an acceptance criterion around the tendrl-ansible inventory file.
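To make the manual step concrete, here is a rough sketch of what the updated inventory might look like. Note this is illustrative only: the hostnames are made up, and the group names (`tendrl_server`, `gluster_servers`) and the playbook name are assumptions — check the tendrl-ansible README for the exact names used by your version.

```ini
# Illustrative tendrl-ansible inventory after the admin appends the new
# Gluster nodes. After editing, the admin re-runs the playbook manually,
# e.g. something like:
#   ansible-playbook -i inventory_file site.yml
# (playbook name assumed -- see the tendrl-ansible docs)

[tendrl_server]
tendrl-server.example.com

[gluster_servers]
existing-node-1.example.com
existing-node-2.example.com
new-node-3.example.com    ; newly added entry
new-node-4.example.com    ; newly added entry
```

Once tendrl-ansible finishes, the new nodes run tendrl-node-agent pointed at the existing tendrl server, and the automatic detection plus import cluster flow described above takes over.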
@shtripat Ack your comments. I'll assume you'll make the appropriate updates to the spec.
With regards to the following statement: "...So its like admin first modifies the inventory file for tendrl-ansible with entries for additional nodes and then runs it...."
I know if the user created his/her own inventory file, the user probably has to update it. However, if the user is using the default /etc/ansible/hosts, should we consider adding the new entries directly there and save the user the effort? This is probably a nice-to-have, since we don't know yet how often users create their own inventory file versus using the default Ansible hosts file.
@julienlim agree. At this moment we expect the user to manually change the inventory file and run tendrl-ansible.
@julienlim @r0h4n added another issue https://github.com/Tendrl/specifications/issues/257 for specifying the detection part of new nodes in gluster cluster. @r0h4n as discussed please take care of this part of specification under a separate spec.
Updated the PR with more expected behavior details.
As per suggestions from @shtripat @r0h4n, the expansion of the cluster will take place when the cluster is in the managed state and can be indicated as follows:
@julienlim @mcarrano @nthomas-redhat , what are your thoughts? Please update the design for the same.
@a2batic @shtripat @nthomas-redhat @r0h4n @Tendrl/qe
Thoughts?
@a2batic - We'll look into making design updates.
@julienlim @a2batic @gnehapk During expansion of the cluster, we should show the message `Expanding` with a `View Details` link to show the job details. Also, the kebab menu options like `Enable/Disable profiling` and `Unmanage` should be disabled for the cluster.
In case there is a failure in expansion, an event would be logged from the backend. Also, the `View Details` link should remain and link to the failed task details. In case of success, the job link vanishes and the message says `Ready to use` as usual for the cluster.
Thoughts??
Also, while expansion of the cluster is going on, the cluster-id should not be clickable, and preferably a different icon for the cluster status is required. @julienlim ?
If I read the specification correctly, Tendrl notices a new node automatically and, via some notification, asks the admin to run tendrl-ansible, right? Has anyone thought about the opposite scenario, when a node is removed from gluster?
Lubos, so this spec talks about only the expansion scenario; the shrink part is not in its scope.
@a2batic @julienlim @gnehapk @shtripat We've added a new wireframe to reflect what this should look like in the UI while the expansion is taking place. This approach can apply to expansion or any similar operation. See https://redhat.invisionapp.com/share/8QCOEVEY9#/282443505_Clusters-Update_In_Progress
In summary:
* The status icon should change to `pficon-in-progress`
Let me know if you have any questions.
@mbukatov @julienlim Is the view detail action (click on the cluster name to view cluster details) also allowed for an expanding cluster? As per the mockup, the cluster name is hyperlinked.
@gnehapk @a2batic @shtripat @nthomas-redhat @r0h4n @Tendrl/qe @mcarrano
Design updates have been posted at https://redhat.invisionapp.com/share/8QCOEVEY9. See https://github.com/Tendrl/commons/issues/849#issuecomment-371000495 for further details.
@gnehapk During expansion, details are still viewable. For views where there may be out-of-synch data, we show an inline notification -- see https://redhat.invisionapp.com/share/8QCOEVEY9#/screens/283595310 example.
Signed-off-by: Shubhendu shtripat@redhat.com tendrl-bug-id: Tendrl/specifications#253