Open · rsevilla87 opened 1 year ago
With our move to prow, does it make sense to put this in scale-ci-deploy? Should we start abstracting these sort of day-2 operations in e2e-benchmarking or similar?
That's right. I'm not sure e2e-benchmarking would be the right place though, as it is a "benchmarking" repo and not a day-2 one.
Apart from that, this project still does a lot of day-2 operations and it has been updated recently (for example https://github.com/cloud-bulldozer/scale-ci-deploy/pull/213 and https://github.com/cloud-bulldozer/scale-ci-deploy/pull/211). If we're going to make that move, we should consider moving all the day-2 operations currently performed here as well.
Well, #213 and #211 are really "hacks" to allow us to test things like OVNIC or newer bits of OVN in our CI - not sure we want to conflate those steps with "Day 2 Operations".
Agreed... Not sure it is the right place.
Maybe we should consider a new repo for "Day 2" Operations?
Back in the Telco days, that's the path we went down.
We would deploy using JetSKi and then have all the day-2 config (operators, logging stack setup, BigIP config) encapsulated in https://github.com/redhat-performance/webfuse
Pyroscope is an interesting tool that can be very useful when reporting or troubleshooting low-level performance issues. Deploying and configuring it as a day-2 operation once the cluster is ready would be great.
For the moment, I'd configure Pyroscope to scrape these components, which already expose the /pprof endpoints by default:
We can also consider (optionally?) scraping other core components like:
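As context for what pull-mode scraping relies on, here is a minimal Go sketch of how a component exposes those /debug/pprof endpoints; the `localhost:6060` address is only illustrative, since the components listed above already wire this up by default:

```go
// Minimal sketch: exposing the pprof endpoints that Pyroscope would
// scrape in pull mode. The listen address is an assumption for the
// example; real components expose this on their own metrics/debug ports.
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers on the default mux
)

func main() {
	// Pyroscope (or curl) can then pull profiles from, e.g.,
	// http://<host>:6060/debug/pprof/profile and /debug/pprof/heap.
	log.Fatal(http.ListenAndServe("localhost:6060", nil))
}
```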
Before starting to develop this RFE, we should investigate how much storage Pyroscope requires and determine the overhead (if relevant) it adds to these components.
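To get a first rough number for that, something like the following Go sketch could measure how many bytes a single scrape of a pprof endpoint produces; the target URLs and the 10s CPU profile duration are assumptions, not the actual component addresses:

```go
// Rough sketch to estimate per-scrape profile size as an input for
// sizing Pyroscope storage. Point the targets at any component that
// exposes the pprof endpoints.
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"time"
)

// profileSize downloads one profile and returns its size in bytes.
func profileSize(url string) (int64, error) {
	client := &http.Client{Timeout: 60 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return io.Copy(io.Discard, resp.Body)
}

func main() {
	targets := []string{
		// Hypothetical endpoints; replace with the real component addresses.
		"http://localhost:6060/debug/pprof/heap",
		"http://localhost:6060/debug/pprof/profile?seconds=10",
	}
	for _, t := range targets {
		n, err := profileSize(t)
		if err != nil {
			log.Printf("%s: %v", t, err)
			continue
		}
		fmt.Printf("%s: %d bytes per scrape\n", t, n)
	}
}
```

Multiplying bytes per scrape by scrape frequency, number of targets and retention period should give a ballpark of the storage Pyroscope would need; the CPU overhead of serving the profiles would still have to be measured on the components themselves.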