Open musamaanjum opened 2 months ago
@helen-fornazier I'm unable to assign this issue to you. Please have a look at what is causing the discrepancy.
I see all these branches for stable-rt
what is missing?
But indeed, I wasn't able to find node 66abc518e49a7366b292a076 in KCIDB, for instance (which is present in the report you sent). @JenySadadia could you check please?
Also, shouldn't these node_timeouts be a MISS?
About the MISS: I just noticed these are build errors, so we need this: https://github.com/kernelci/kcidb-io/issues/82
But indeed, I wasn't able to find node 66abc518e49a7366b292a076 in KCIDB, for instance (which is present in the report you sent). @JenySadadia could you check please?
@helen-fornazier @JenySadadia This is my only concern at this time. The data should have been the same at both places.
But indeed, I wasn't able to find node 66abc518e49a7366b292a076 in KCIDB, for instance (which is present in the report you sent). @JenySadadia could you check please?
Yes, I am unable to find https://staging.kernelci.org:9000/viewer?node_id=66abc518e49a7366b292a076 on the KCIDB dashboard. But it is present in the new Grafana dashboard, right?
If so, Maestro did send the data and the KCIDB dashboard is somehow not showing it.
Could you please check? @spbnick
I checked the staging logs. Maestro didn't submit this entry. Then how did it reach the new dashboard? Is there any other source submitting Maestro data to it? @helen-fornazier
Let me clarify things:
about 66abc518e49a7366b292a076:
So the questions are: why is it not in KCIDB? Why didn't Maestro submit it? Why do we have this inconsistency? (cc @JenySadadia)
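One way to make this kind of inconsistency concrete is to diff the node IDs seen in the report against those found in KCIDB. The following is only a minimal sketch: `missing_nodes` is a hypothetical helper, the ID lists are assumed to have been collected by hand or by separate queries, and the second ID is a made-up placeholder, not a real node.

```python
def missing_nodes(reported_ids, kcidb_ids):
    """Return IDs that appear in the report but not in KCIDB, sorted."""
    return sorted(set(reported_ids) - set(kcidb_ids))


# The first ID is the node discussed in this thread; the second is a
# hypothetical placeholder used only to illustrate a node present in both.
reported = ["66abc518e49a7366b292a076", "hypothetical-node-id-1"]
in_kcidb = ["hypothetical-node-id-1"]

print(missing_nodes(reported, in_kcidb))
```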
Hello @helen-fornazier @musamaanjum
I analyzed the staging logs and found the root cause.
From the logs, the kcidb bridge service crashed on 08/01/2024 06:17:29 PM UTC and restarted on 08/02/2024 12:16:55 AM UTC.
The node https://staging.kernelci.org:9000/viewer?node_id=66abc518e49a7366b292a076 was updated at 2024-08-01 08:08:57 PM UTC.
That's why we lost the updated event from the API, as the bridge service was not running at that time. Hence, the KCIDB submission is missing for the node.
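As a quick sanity check of the timeline above, the three timestamps can be compared directly. This is just an illustrative sketch: `parse_utc` and `missed_by_bridge` are hypothetical helpers, and the format strings are assumptions based on how the timestamps are written in this comment.

```python
from datetime import datetime, timezone

# Assumed formats, matching the timestamps quoted in this comment.
FMT_LOG = "%m/%d/%Y %I:%M:%S %p"   # e.g. 08/01/2024 06:17:29 PM
FMT_NODE = "%Y-%m-%d %I:%M:%S %p"  # e.g. 2024-08-01 08:08:57 PM


def parse_utc(text, fmt):
    """Parse a timestamp string and mark it as UTC."""
    return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)


def missed_by_bridge(event_time, down_from, up_again):
    """True if the event happened while the bridge service was down."""
    return down_from <= event_time < up_again


crashed = parse_utc("08/01/2024 06:17:29 PM", FMT_LOG)
restarted = parse_utc("08/02/2024 12:16:55 AM", FMT_LOG)
node_updated = parse_utc("2024-08-01 08:08:57 PM", FMT_NODE)

print(missed_by_bridge(node_updated, crashed, restarted))
```

The node update falls inside the downtime window, which is consistent with the missing submission.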
This issue has been partially taken care of by a patch that auto-restarts all the pipeline services after a crash. The patch was merged and deployed on 2nd Aug.
I've checked stable-rt. There hasn't been any update for 8 days. Let's wait to see if we get correct and coherent results on Grafana on the next run.
@musamaanjum @helen-fornazier @JenySadadia can we close this task if it has been resolved?
Maintainers are already using the Grafana dashboard. There was a report that preempt_rt config builds are missing (https://github.com/kernelci/kernelci-core/pull/2397#issuecomment-2272789692). I've investigated and found that the build data is visible in the results obtained from result-summary.
https://grafana.kernelci.org/d/OKXc44EIz/home?orgId=1&var-origin=maestro&var-tree=stable-rt&var-branch=All&var-test_path_regex=%25&var-platform=%25&var-config=%25&var-datasource=cdmoe4lcafu2od
I'll attach the results file from result-summary below in the comments as it isn't attached here.
The discrepancies are as follows:
preempt_rt jobs aren't present on Grafana, and preempt_rt isn't present in the config column. Probably the preempt_rt jobs are missing.
cc: @nuclearcat @padovan