Open juanjcsr opened 7 months ago
I'm experiencing the exact same issue with a latency SLO.
EDIT: Never mind, in my case I used an incorrect metric (I also used a _bucket metric for the total query).
EDIT2: Actually, this is still an issue; I just experienced it again for another metric.
I'm getting the exact same error in the browser console:
TypeError: Cannot read properties of undefined (reading 'push')
at aligneddata.tsx:121:19
at Array.forEach (<anonymous>)
at aligneddata.tsx:120:18
at Array.forEach (<anonymous>)
at aligneddata.tsx:117:15
at Pd (BurnrateGraph.tsx:115:7)
at Ea (react-dom.production.min.js:167:137)
at Ou (react-dom.production.min.js:197:258)
at Sl (react-dom.production.min.js:292:88)
at bs (react-dom.production.min.js:280:389)
aligneddata.tsx:121 Uncaught
TypeError: Cannot read properties of undefined (reading 'push')
at aligneddata.tsx:121:19
at Array.forEach (<anonymous>)
at aligneddata.tsx:120:18
at Array.forEach (<anonymous>)
at aligneddata.tsx:117:15
at Pd (BurnrateGraph.tsx:115:7)
at Ea (react-dom.production.min.js:167:137)
at Ou (react-dom.production.min.js:197:258)
at Sl (react-dom.production.min.js:292:88)
at bs (react-dom.production.min.js:280:389)
The SLO I used:
apiVersion: pyrra.dev/v1alpha1
kind: ServiceLevelObjective
metadata:
  labels:
    pyrra.dev/app: tempo
  name: tempo-reads-errors-test
  namespace: default
spec:
  alerting:
    absent: true
    burnrates: true
  description: Reading traces from Tempo API endpoints should answer queries 99% successfully
    over 2w.
  indicator:
    ratio:
      errors:
        metric: tempo_request_duration_seconds_count{cluster=~"tempo", job=~"default/query-frontend",
          route=~"api_.*", status_code=~"5.*"}
      total:
        metric: tempo_request_duration_seconds_count{cluster=~"tempo", job=~"default/query-frontend",
          route=~"api_.*"}
  target: "99.5"
  window: 2w
@metalmatze I added a debug log, and it looks like in my case timeValues is larger than the pre-initialized values array here: https://github.com/pyrra-dev/pyrra/blob/a5e3b4606daf843156111f791ff669d864163e7a/ui/src/components/graphs/aligneddata.tsx#L121
timeValues: 3 values: 1
Any idea how to fix this? The SLO is displayed correctly in Grafana.
EDIT: About 45 minutes later I no longer get the error, and I did not change anything. So it looks like an intermittent error that we should catch somehow.
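For anyone who wants to experiment with a guard, here is a minimal sketch of the failure mode described above. It is not the actual code in aligneddata.tsx: the function name `appendColumns` and the data shapes are hypothetical and only mirror the debug output (3 entries in timeValues, 1 pre-initialized entry in values). The idea is to create a missing target array on demand instead of assuming it exists.

```typescript
// Hypothetical helper, not Pyrra's implementation.
// `columns` stands in for the pre-initialized `values` array,
// `rows` stands in for `timeValues` from the debug log.
function appendColumns(columns: number[][], rows: number[][]): number[][] {
  rows.forEach((row, i) => {
    // Without this guard, columns[i] is undefined whenever
    // rows.length > columns.length, and .push throws the TypeError
    // seen in the console.
    if (columns[i] === undefined) {
      columns[i] = [];
    }
    row.forEach((v) => columns[i].push(v));
  });
  return columns;
}

// Same proportions as the debug log: 3 rows, 1 pre-initialized column.
const aligned = appendColumns([[]], [[1, 2], [3], [4, 5]]);
console.log(aligned.length);
```

Whether silently growing the array is the right behavior for the graph, or whether the mismatch should be reported to the user instead, is a design decision for the maintainers.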
Hi, looks like I ran into this as well. In my troubleshooting it seemed related to an empty response for a burnrate metric query.
In my case Pyrra was querying for
istio_request_duration_milliseconds:burnrate12d{destination_canonical_service=\"enwiki-articlequality-predictor-default\", kubernetes_namespace=\"istio-system\",response_code=~\"2..\",site=\"codfw\",slo=\"liftwing-articlequality-latency\"}
However, this gave an empty response, because the istio_request_duration_milliseconds:burnrate12d recording rule metric didn't have a response_code label.
For the time being I'm working around it by inverting the SLO definition to use response_code!~"[345].." instead of response_code=~"2..".
That said, the response_code label not making it through to the burnrate recording rule metric may be a deeper issue. At any rate, after updating the query the page renders again for me.
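A short sketch of why the inverted matcher works, assuming the recorded series simply lacks the response_code label. In Prometheus, a label that is absent is treated as an empty string, and regex matchers are fully anchored. The `matches` function and `Matcher` type below are hypothetical illustrations, and JavaScript regular expressions only approximate Prometheus's RE2 syntax.

```typescript
type MatchOp = "=" | "!=" | "=~" | "!~";

interface Matcher {
  name: string;
  op: MatchOp;
  value: string;
}

// Hypothetical re-implementation of label matching, for illustration only.
function matches(labels: Record<string, string>, m: Matcher): boolean {
  // A missing label behaves like a label with an empty value.
  const actual = labels[m.name] ?? "";
  // Regex matchers are anchored at both ends.
  const anchored = new RegExp(`^(?:${m.value})$`);
  switch (m.op) {
    case "=":
      return actual === m.value;
    case "!=":
      return actual !== m.value;
    case "=~":
      return anchored.test(actual);
    case "!~":
      return !anchored.test(actual);
  }
}

// A recorded series without a response_code label (label set is illustrative).
const recorded = { slo: "liftwing-articlequality-latency", site: "codfw" };

const positive: Matcher = { name: "response_code", op: "=~", value: "2.." };
const inverted: Matcher = { name: "response_code", op: "!~", value: "[345].." };

console.log(matches(recorded, positive)); // the empty string does not match "2.."
console.log(matches(recorded, inverted)); // the empty string is not a 3xx/4xx/5xx code
```

The positive matcher rejects the label-less series, which would explain the empty response, while the negative matcher accepts it.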
Hello everyone,
I've found a possible bug or issue when displaying some SLOs. When I navigate from the SLO list to the details page or when I check the details of a multiburn alert, I get a white screen:
The JavaScript error is the following:
The SLO that generates the error is the following:
I locally launched the UI (master branch) against my deployed pyrra server and encountered the same issue.
I'm running Pyrra v0.7.4
Do you have any idea why this is happening, or how I can help debug this issue?
Thank you, I really enjoy the work everyone is doing with Pyrra 😄