Closed by axw
#### Build stats

* Build Cause: `[Pull request #183 opened]`
* Start Time: 2020-07-30T05:44:09.738+0000
* Duration: 4 min 44 sec

#### Test stats :test_tube:

| Test    | Results |
| ------- | :-----: |
| Failed  | 0       |
| Passed  | 9       |
| Skipped | 0       |
| Total   | 9       |
Thinking it through, I think I disagree with this change.
It is true we can get all that from Elasticsearch; it was actually considered initially. But the complexity of indexing those fields and the storage cost are (IMO) negligible, and it pays off at query time.
A trivial `_search` query would give me all the data I want without having to craft those calculations in a query body (accounting for optional values, zero divisions, etc.).
Now if I need to dig beyond dashboards I need to write arbitrarily complicated queries instead.
I might not be the heaviest user of the hey-apm benchmarking, but I have barely used the derived metrics; when I did, I had to look up how they were calculated, as it wasn't clear to me what the difference/time unit etc. for the metrics were.
@jalvz could you share where/how you were using these values?
Any time I queried the actual docs instead of using the dashboards. That is, every time I want granular data or am looking for something specific. The last time was during the recent spike caused by enabling the instrumentation.
> A trivial `_search` query would give me all the data I want without having to craft those calculations in a query body (accounting for optional values, zero divisions, etc.). ... Any time I queried the actual docs instead of using the dashboards. That is, every time I want granular data or am looking for something specific. The last time was during the recent spike caused by enabling the instrumentation.
I've been thinking of these metrics as only useful for comparisons, rather than as something to look at in isolation. Can you describe how you were using them in isolation? Or, if you were using them for comparison, where the dashboards fell short and looking at the raw docs helped.
Just trying to understand why we can't use dashboards + kuery bar.
> how you were using them in isolation?
If I have a single run from a PR, I can't just use the dashboards because results are averaged. So I'd query individual fields from an individual document (I grab a `reportId` from the log to filter by) and then compare them with the latest average from the related dashboard.
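For concreteness, this is roughly the kind of trivial `_search` being described, sketched with the official go-elasticsearch client; the `hey-bench` index name and `report_id` field are placeholders, not necessarily hey-apm's real names:

```go
package main

import (
	"context"
	"log"
	"strings"

	"github.com/elastic/go-elasticsearch/v7"
)

func main() {
	// NewDefaultClient talks to http://localhost:9200 unless ELASTICSEARCH_URL is set.
	es, err := elasticsearch.NewDefaultClient()
	if err != nil {
		log.Fatal(err)
	}

	// Fetch the single benchmark document for one run, filtering by the
	// report id taken from the hey-apm log output. Names are hypothetical.
	query := `{"query": {"term": {"report_id": "some-report-id"}}, "size": 1}`
	res, err := es.Search(
		es.Search.WithContext(context.Background()),
		es.Search.WithIndex("hey-bench"),
		es.Search.WithBody(strings.NewReader(query)),
	)
	if err != nil {
		log.Fatal(err)
	}
	defer res.Body.Close()
	log.Println(res.String())
}
```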
> why we can't use dashboards + kuery bar
I think we can do that now just the same as before this change. The difference is that now we can't query individual documents (or rather we can, but it is quite a bit more difficult).
I also don't understand what specific problem this is solving, besides doing the same thing in a different way (and with fewer lines of code, but lines of code are not a problem per se).
> If I have a single run from a PR, I can't just use the dashboards because results are averaged. ... I think we can do that now just the same as before this change. The difference is that now we can't query individual documents (or rather we can, but it is quite a bit more difficult).
We can't do it the exact same way (which would imply no change at all), but we can do this with a data table visualisation and scripted fields (until Lens supports `bucket_script`: https://github.com/elastic/kibana/issues/67405). I've added one to the dashboard; let me know if this does what you need or what's missing.
> I also don't understand what specific problem this is solving, besides doing the same thing in a different way (and with fewer lines of code, but lines of code are not a problem per se)
There were a couple of things that set this off: code like `func Div(i1, i2 interface{}) *float64`, and unnecessary code, e.g. `func Sum(xs ...uint64) uint64`.
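As a sketch of the alternative (hypothetical names, not the actual hey-apm code), the inline version of such a calculation is a typed division with an explicit zero guard:

```go
package metrics

import "time"

// throughput sketches the "inline arithmetic" alternative to a generic
// Div(interface{}, interface{}) *float64 helper: a typed division with an
// explicit guard against dividing by zero. Names are hypothetical.
func throughput(eventsSent uint64, elapsed time.Duration) float64 {
	secs := elapsed.Seconds()
	if secs == 0 {
		return 0
	}
	return float64(eventsSent) / secs
}
```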
If I had known you were using the derived metrics at the individual report level (code comments explaining why we're not using ES aggs would have helped), I wouldn't have gone and removed them; I would have just cleaned up the way they were calculated.
Anyway, as above, please let me know what's missing. We can add them back in if really needed.
> we can do this with a data table visualisation and scripted fields
I think that will work - and if not we will figure something else out, thanks
We can derive these metrics in Elasticsearch instead; delete the "numbers" package and use inline arithmetic.
Also fixed a bug with the "total events sent - success %" metric reported to stdout, which mistakenly used an errors count instead of the events count.
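For context, a hedged sketch of what the corrected stdout calculation amounts to, assuming success % means events accepted as a share of events sent (the names are hypothetical, not hey-apm's actual variables):

```go
package metrics

// successPercent sketches the corrected "total events sent - success %"
// calculation for stdout: accepted events as a percentage of events sent,
// using the events count (not the errors count) as the denominator.
// Names are hypothetical.
func successPercent(accepted, sent uint64) float64 {
	if sent == 0 {
		return 0
	}
	return 100 * float64(accepted) / float64(sent)
}
```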