-
When the `collect-job-metrics` flag is set to true, metrics at both the job level and the step level are sent to Datadog. This doesn't work well if a workflow has a lot of jobs and steps. For a large wo…
-
### Is there an existing issue for this?
- [X] I have searched the existing issues
### YACE version
0.51.0
### Config file
apiVersion: v1alpha1
sts-region: ap-east-1
discovery:
  jobs:
    - …
-
### What did you do?
Prometheus automatically discovers the metrics interfaces exposed by microservices based on the __meta_kubernetes_pod_annotation_prometheus_io_scrape label in the microservice co…
-
Currently, GitLab has a way for you to export metrics: https://docs.gitlab.com/ee/ci/metrics_reports.html
The idea is to scrape these metrics artifact files from defined jobs in a repo and export the…
-
I followed the guide: [arena/docs/userguide/9-top-job-gpu-metric.md](https://github.com/kubeflow/arena/blob/master/docs/userguide/9-top-job-gpu-metric.md).
Everything works as expected until the last one, when…
-
### Description
The Renovate on-prem application only exposes a `/status` endpoint that can be used to see the current state of the application. However, it is difficult to parse and send t…
-
**Describe the bug**
I don't know if this is the correct place for this; if it's not, please advise where to direct this issue.
TL;DR: `ama-metrics-operator-targets` seems to have a memory leak (I assume…
-
**Describe the bug**
Job label selectors used in dashboards from the Loki mixin (https://github.com/grafana/loki/blob/main/production/loki-mixin/dashboards/loki-writes.libsonnet#L21) don't match the …
-
#### Problem
Recently, after some larger algorithm updates, we've been running archival jobs and other jobs that almost make it to completion but are then killed off and have to be restarted. Some of t…
-
I've been getting some errors when computing/loading extensions from a `SortingAnalyzer` containing a `template_metrics` extension. I calculate extensions from a dict as follows:
```py
phy_exts = {…