Closed Kailai-Wang closed 2 years ago
Parachain Grafana dashboard
Thanks @niteshb3495, this is the machine information (CPU/RAM/network), right?
Can we have the chain information too, similar to the panel for parachain-staging-0?
(I think you can copy the JSON model and just modify the IP.)
Litentry-parachain-prod-0,1,2 is updated. CPU, disk, and memory alert rules are also included.
NOTE:
No metrics were found for litmus-parachain:
http://46.4.120.73:9090/graph?g0.range_input=1h&g0.expr=node_cpu_seconds_total&g0.tab=1
The litmus node (54.254.120.238) does not expose any metrics.
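The check above can also be scripted against the Prometheus HTTP API instead of the graph UI. This is a minimal sketch, not the setup used in this issue: it assumes the server at 46.4.120.73:9090 answers the standard `/api/v1/query` endpoint and that each series carries an `instance` label of the form `host:port`. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, expr):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': expr})}"


def instances_with_metric(response_body):
    """Return the set of `instance` label values found in a query response."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    return {
        series["metric"].get("instance", "")
        for series in payload["data"]["result"]
    }


def host_has_metrics(response_body, host):
    """True if any series in the response comes from `host` (port ignored)."""
    return any(
        instance.rsplit(":", 1)[0] == host
        for instance in instances_with_metric(response_body)
    )


if __name__ == "__main__":
    # Fetching is left out on purpose; pass the body of the URL below
    # (for example from urllib.request.urlopen) to host_has_metrics().
    print(build_query_url("http://46.4.120.73:9090", "node_cpu_seconds_total"))
```

If `host_has_metrics(body, "54.254.120.238")` returns `False`, the node is either not scraped or its exporter is unreachable, which matches what the graph shows.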
TODO
- node-exporter metrics in litmus nodes
- substrate-exporter metrics in litmus nodes

It seems node-exporter on litmus was blocked.
@Kailai-Wang @StamfordDigital I checked the server litmus-parachain-sg-0 and didn't find node-exporter on it. Is there any particular reason not to install it?
Since node_exporter isn't installed, the Grafana dashboard has no information about disk, CPU, or memory usage.
That said, I think we could also collect this information with CloudWatch.
I'd prefer node-exporter simply because it's more generic (imagine we want to move some servers out of AWS), although I think CloudWatch is easier to set up thanks to its graphical UI.
@niteshb3495 what's your thought on this please?
I agree, having node_exporter is a good idea. Just to let you know, I have added some of the nodes to Grafana here. As I proposed in the devops channel, we should take care of network security, so with that in mind I installed Prometheus on the jumphost and configured it in Grafana as the Prometheus-AWS datasource. @chenzongxiong - please use the same Prometheus for further monitoring, to reduce future work. On the jumphost, Prometheus is under /opt/
I have set up a Jenkins job that automates changes to the Prometheus config file and the node_exporter installation, so both become a single-click job.
Please have a look at the Prometheus config file here
To make any changes to Prometheus, we can follow the steps below.
Note: As I was still testing everything, I have not merged the nitesh branch to master.
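For readers who have not seen the config file, the kind of entry such a job would add for node_exporter targets can be sketched as follows. This is an illustration under assumptions, not the contents of the actual file: the job name and host names are hypothetical, and port 9100 is simply node_exporter's default listen port.

```python
def render_node_exporter_job(job_name, hosts, port=9100):
    """Return a YAML fragment for one entry under `scrape_configs`.

    The fragment uses the standard Prometheus keys `job_name`,
    `static_configs`, and `targets`.
    """
    if not hosts:
        raise ValueError("at least one host is required")
    lines = [
        f"- job_name: '{job_name}'",
        "  static_configs:",
        "    - targets:",
    ]
    # One target per host, all on the same exporter port.
    lines.extend(f"        - '{host}:{port}'" for host in hosts)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical job and host names, for illustration only.
    print(render_node_exporter_job(
        "litmus-node-exporter",
        ["litmus-node-a.example.internal", "litmus-node-b.example.internal"],
    ))
```

The printed fragment would go under the `scrape_configs:` key of the Prometheus config. Generating it from a host list keeps the file consistent when nodes are added.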
@Kailai-Wang I have no permission to access the litmus nodes. Can you install node-exporter on them when you're available?
I'm a bit curious: my previous scripts could also install the related apps, so I'm not clear about the current setup procedure. Once node-exporter is installed, this issue can be closed.
@niteshb3495 As for migrating everything to the jumphost (AWS), I think we need to arrange a new epic for that.
And I think for migration, the jumphost and the Prometheus server should also be separated.
I'd like to know what Jenkins is used for here. Why not use Rundeck and Ansible? As far as I know, Ansible can achieve the same functionality as Jenkins.
> And I think for migration, the jumphost and the Prometheus server should also be separated.
Having them on separate servers would work without any issue, but I cannot see any concrete reason to do so. The jumphost has separate Linux users created without sudo access, so no one can access or change Prometheus or Jenkins without the required permissions. Moreover, the Prometheus and Jenkins ports are open only for the required access and not to the world, so no one from the outside world can reach the services.
> I'd like to know what Jenkins is used for here. Why not use Rundeck and Ansible?

Jenkins benefits:
- Main reason: Jenkins is installed in the private network, so we can connect to any server without using its public IP (using a public IP means exposing the server to the outside world). Rundeck is not in our private network, so we would need each server's public IP to connect from Rundeck.
- We can integrate Jenkins with any other infrastructure automation, for example creating resources in AWS or taking database backups.
- Jenkins is plugin-based, so we can build plenty of automation by installing the required plugins.
- Whatever credentials we use in Jenkins are stored securely with us only. There is no need for a third-party tool to store credentials.
- Jenkins can integrate with Ansible and with almost any other tool.
- Example: in MCP, developers depend on me to trigger a database backup. I am going to create a Jenkins job that lets developers trigger the backup whenever they want, without my help. This also secures server access, as they don't need a jumphost login to run the backup script. In addition, I can restrict each developer to the specific job so that they cannot even see other jobs, which is additional security.
> As far as I know, Ansible can achieve the same functionality as Jenkins.

No. We can use Ansible to work like Jenkins or vice versa, but their functionality is completely different.
- A Jenkins job is a one-click task and can be triggered from a mobile device because it has a UI. We cannot do the same with Ansible.
- Jenkins can run any playbook in Ansible.
- Jenkins jobs can be scheduled.
- Jenkins can send notifications to Slack, email, or any other tool.
- We can add webhooks in Jenkins.
- and many more
Installed the node-exporter on the litmus prod nodes. @chenzongxiong - FYI, closing the ticket.
Context
Similar to litmus, we need to set up a basic monitoring and alerting system for litentry-parachain in Grafana.
Task
:heavy_check_mark: Please set appropriate labels and assignees if applicable.