aws-samples / aws-parallelcluster-monitoring

Monitoring Dashboard for AWS ParallelCluster
MIT No Attribution

Cannot see dashboards #9

Closed afernandezody closed 3 years ago

afernandezody commented 3 years ago

Hello, when I log in to Grafana, the dashboards are not there (it seems to have the data but doesn't create the dashboards from it). Checking the master and compute nodes shows the following running containers:

CONTAINER ID   IMAGE                              COMMAND                  CREATED          STATUS          PORTS     NAMES
9594c6bde24f   nginx                              "/docker-entrypoint.…"   21 minutes ago   Up 21 minutes             nginx
b427735a4851   prom/pushgateway                   "/bin/pushgateway"       21 minutes ago   Up 21 minutes             pushgateway
42b4c6967e5a   prom/prometheus                    "/bin/prometheus --c…"   21 minutes ago   Up 21 minutes             prometheus
9c18bbc6826e   quay.io/prometheus/node-exporter   "/bin/node_exporter …"   21 minutes ago   Up 21 minutes             node-exporter
3516f5f1c85a   grafana/grafana                    "/run.sh"                21 minutes ago   Up 21 minutes             grafana

Compute
701c27e08b7b   quay.io/prometheus/node-exporter   "/bin/node_exporter …"   15 minutes ago   Up 15 minutes             node-exporter

I was wondering if anything is missing so I can rule it out as a potential cause. Thanks.

nicolaven commented 3 years ago

Hi @afernandezody,

I am not sure what you changed in the code, so I can't tell what the root cause could be. If it is OK with you, we can set up a meeting so I can have a closer look at it.

Thanks

afernandezody commented 3 years ago

Hi @nicolaven, the main change is how Docker is installed (for CentOS), plus a minor streamlining at the end of the script as I'm not using GPUs. I'm in US Eastern time but very flexible with my schedule. Which times would work for you? Thanks.

afernandezody commented 3 years ago

The dashboards seem to be working now (I didn't change anything except the web browser). The only element that is not working is cost: it works with alinux2, but it comes out as 0 (for every charge) with CentOS8.

nicolaven commented 3 years ago

Good progress! Yes, again, this has been tested with AL2 only. Feel free to modify the code for CentOS8.

Thanks

afernandezody commented 3 years ago

I haven't had time to check the custom-metrics shell scripts. Crontab has them scheduled, but maybe they're not working for some reason.

WajdiH commented 3 years ago

Hi, I have the same issue. What should I do to get the cost dashboard working on CentOS7?

afernandezody commented 3 years ago

I was hoping to have some time this morning to get back to this development. @WajdiH - The cost dashboard worked for me a couple of times but was blank on many others. Not sure why, but you can try using a larger master instance and see if that works. Otherwise, I will update the thread if I figure out the cause of the issue.

WajdiH commented 3 years ago

Thank you @afernandezody for your reply. I am using t3.small for the master instance. What type of EC2 instance do you think I should try?

WajdiH commented 3 years ago

Also, are there any changes that I should make to the Docker installation?

afernandezody commented 3 years ago

@WajdiH, never mind what I said earlier. 1 - If you can log in to the Grafana dashboards (even if the costs come up blank), it means that the Docker installation was successful and you don't need to touch it. 2 - The problem with the costs is (at least for CentOS8) that the scripts need to invoke python3 (AL2 seems to work with python). This requires a slight modification to the custom-metrics shell scripts. [Note: maybe an alias could do the trick, but it wouldn't be a particularly elegant solution.] @nicolaven, maybe you want to chime in here before even starting any PR.
P.S. Forgot to mention that botocore also needs to be installed.
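
For anyone hitting the same blank cost panel, a quick way to check both points above (which interpreter name exists on the node, and whether botocore is importable from it) might look like the sketch below. It is only a diagnostic, not part of the repository: the function names `pick_interpreter` and `can_import` are made up for illustration, and it assumes the cost scripts call either `python3` or `python` from the `PATH`.

```python
import shutil
import subprocess
import sys


def pick_interpreter(candidates=("python3", "python")):
    """Return the full path of the first interpreter name found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def can_import(interpreter, module):
    """Return True if `interpreter -c "import <module>"` exits successfully."""
    result = subprocess.run(
        [interpreter, "-c", f"import {module}"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


if __name__ == "__main__":
    interpreter = pick_interpreter()
    if interpreter is None:
        print("No python3 or python found on PATH")
    else:
        print(f"Interpreter found: {interpreter}")
        if can_import(interpreter, "botocore"):
            print("botocore is importable")
        else:
            # Assumption: botocore is the only missing dependency, as noted in the P.S.
            print("botocore is missing; install it for this interpreter")
```

If the script reports that only `python3` is present, or that botocore is missing, that would be consistent with the cost values coming up as 0 on CentOS8.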