AMDResearch / omnistat

https://amdresearch.github.io/omnistat/
MIT License
5 stars 0 forks source link

Test multiple nodes and usermode #84

Closed jordap closed 2 months ago

jordap commented 2 months ago

Improvements to the Docker Compose environment for testing.

Tasks:

jordap commented 2 months ago

Basic usermode testing is working. It doesn't cover all cases, but it can help us identify if omnistat-usermode is not working. It doesn't cover omnistat-query yet because that won't work on Github, but we can potentially have a test for omnistat-query that only runs locally in machines with GPUs.

While working on testing I noticed some issues that I want to review. For example, Prometheus takes a surprisingly long time to start scraping data from Omnistat. After Prometheus and the Omnistat monitor are up and responding to requests, it takes approximately 5 seconds for Prometheus to start scraping values.

I also think we need a way to (optionally) store the output of the exporters for easier debugging.