Open apriofrost opened 6 years ago
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. We value your input and contribution. Please leave a comment if this issue still affects you.
This list covers missing documentation that would help Builder admins monitor and troubleshoot Builder in production, beyond the existing On-call Engineering Duties Wiki and Troubleshooting Builder Service Wiki.
Docs to add:
_This issue was created based on the research findings from the Chef Builder admin interviews. You can find the complete research summary here (visible to Chef employees only)._
[ ] A general introduction to the role of each administrative tool: what information it provides, why it is useful for certain scenarios, and what actions you can take based on the information. See the research summary doc for a list of identified tools.
[ ] Direct URLs to the target web pages in a given tool, such as a dashboard view. If there is no fixed URL, provide guidance on how to navigate through the UI to find the content.
[ ] Up-to-date CLI commands, directories, and file names for SSH-ing into the nodes to retrieve logs, including how to display specific lines of a log file on a node.
[ ] Network issues: how to gain visibility into the ZMQ communication between nodes and how to fix problems with it.
[ ] A knowledge base outside the source code where all error codes and messages can be searched to find out what each one means, what the likely cause is, and how to fix it. Below is an example of an error code and messages in the log: