[RFE] More metadata fields

cloud-bulldozer / go-commons

Code repository with all go common packages and libraries

Apache License 2.0


[RFE] More metadata fields #47

Open rsevilla87 opened 4 months ago

rsevilla87 commented 4 months ago

Add more fields to the metadata collection. The ones at https://github.com/cloud-bulldozer/e2e-benchmarking/blob/bf5ac71356e1f128f35cb231ad67e39729837345/utils/index.sh#L183C1-L190C53 could potentially be included.

I'm not sure about the actual benefits of indexing them along with the test results, though. WDYT @jtaleric @dry923 @afcollins, etc.?

cc: @paigerube14 @vishnuchalla

jtaleric commented 4 months ago

IMHO having consistency across our metadata would be ideal.

Not sure if there is additional benefit adding the metadata across indexes.

vishnuchalla commented 4 months ago

> IMHO having consistency across our metadata would be ideal.
>
> Not sure if there is additional benefit adding the metadata across indexes.

+1. We should look for options to keep metadata minimal. Replicating it across indexes is not necessary.

afcollins commented 4 months ago

> I'm not sure about the actual benefits of indexing them along with the test results, though.

If it is information that we either are not retaining or that is difficult to retrieve from what we do capture, then yes, I see the value in capturing it. Then we can filter out runs when we need to.

vishnuchalla commented 4 months ago

I think we have two things regarding metadata here:

1. Adding more necessary metadata for the benchmark results themselves.
2. Adding a CLI implementation to capture and publish just the CI-job-related metadata to the perf_scale_ci ES index, which e2e does as of today, so that we have something easy to maintain going forward.
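
As a rough illustration of the second point, a minimal sketch of what such a CLI could collect might look like the following. Everything here is an assumption made for illustration: the `CIJobMetadata` type, its field names, and the `JOB_NAME`, `BUILD_ID`, and `JOB_STATUS` environment variables are hypothetical and are not taken from go-commons or from the e2e-benchmarking script. Publishing to the perf_scale_ci index is deliberately left out; the sketch only builds and prints the document.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"time"
)

// CIJobMetadata is a hypothetical document shape for CI-job-only metadata.
// The field names are illustrative and do not reflect an existing schema.
type CIJobMetadata struct {
	UUID      string    `json:"uuid"`
	JobName   string    `json:"jobName,omitempty"`
	BuildID   string    `json:"buildId,omitempty"`
	JobStatus string    `json:"jobStatus,omitempty"`
	Timestamp time.Time `json:"timestamp"`
}

// collectCIJobMetadata reads the CI fields through a lookup function, so the
// same code can be fed from os.Getenv in a real run or from a map in tests.
// The environment variable names are assumptions.
func collectCIJobMetadata(uuid string, getenv func(string) string) CIJobMetadata {
	return CIJobMetadata{
		UUID:      uuid,
		JobName:   getenv("JOB_NAME"),
		BuildID:   getenv("BUILD_ID"),
		JobStatus: getenv("JOB_STATUS"),
	}
}

func main() {
	doc := collectCIJobMetadata("example-uuid", os.Getenv)
	doc.Timestamp = time.Now().UTC()

	body, err := json.MarshalIndent(doc, "", "  ")
	if err != nil {
		fmt.Fprintln(os.Stderr, "marshal failed:", err)
		os.Exit(1)
	}
	// A real CLI would send this document to the indexer; here it is printed.
	fmt.Println(string(body))
}
```

Keeping the CI fields in their own small document, keyed by the run UUID, is one way to avoid replicating them across every benchmark index while still allowing a join when needed.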

cc: @chentex

vishnuchalla commented 3 months ago

Appending the Slack thread link, which captures the overall idea.

jtaleric commented 3 months ago

> Adding a CLI implementation to capture and publish just CI job related metadata to the perf_scale_ci ES index which e2e does as of today. So that we will have something easy to maintain going forward.

One thing to consider after more thought and exploration here: all of our CPT runs use the perf_scale_ci index as an "aggregation index". However, in our research testing we might not use CPT runs, so we would need a way to isolate runs without the "aggregation index". That could be the jobSummary data, but we would need to be consistent about where we place this job metadata across our tools.
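
To make the idea of isolating runs from the jobSummary data alone more concrete, here is a small sketch. It assumes that each summary document carries its job metadata as a flat key/value map. The `JobSummary` type, the `filterRuns` helper, and the `ciSystem` key are all hypothetical and do not describe the current jobSummary schema.

```go
package main

import "fmt"

// JobSummary is a hypothetical, trimmed-down stand-in for a jobSummary
// document that embeds its own job metadata.
type JobSummary struct {
	UUID     string
	Metadata map[string]string
}

// filterRuns returns the summaries whose metadata contains every key/value
// pair in want. An empty or nil want matches every run.
func filterRuns(runs []JobSummary, want map[string]string) []JobSummary {
	var out []JobSummary
	for _, r := range runs {
		match := true
		for k, v := range want {
			if got, ok := r.Metadata[k]; !ok || got != v {
				match = false
				break
			}
		}
		if match {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	runs := []JobSummary{
		{UUID: "a", Metadata: map[string]string{"ciSystem": "prow"}},
		{UUID: "b", Metadata: map[string]string{"ciSystem": "manual"}},
	}
	// Select only the runs that were not launched through the CI pipeline.
	for _, r := range filterRuns(runs, map[string]string{"ciSystem": "manual"}) {
		fmt.Println(r.UUID)
	}
}
```

The point of the sketch is only that, if every tool places the job metadata in the same spot, a single filter like this works regardless of whether an aggregation index exists for the run.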