@dapplion When going through the metrics for analysis on the new beta, can you contribute to this checklist so that in the future we can think about finding a way to automate it? Similar to what you did for benchmarking?
From the Apr 1 planning meeting, extra notes regarding planning out testing infrastructure:
Dade: For testing infrastructure, issues don't usually show up unless we run for a couple of days, which could affect velocity. How should we manage this?
Phil: We should define what an ideal testing infrastructure should look like. We want to put the work into a good process, but make sure it accurately gives us metrics/data that help find potential issues.
Phil: Do we need a controlled devnet environment to do testing in, where we can influence parameters to simulate potential issues? Or is that too much work that doesn't reflect the realities of public testnets and mainnet?
Cayman: Devnets are too small; things scale with validators, and it's difficult to reproduce this on our own devnet. Should we focus on how a devnet would be beneficial for preliminary tests such as sending/receiving messages without getting banned immediately, Beacon API endpoints, etc.? Would this infrastructure be worth it? It will take additional servers and such.
Cayman: Maybe there's a way to set slots per second higher? Compress the amount of work? Change chain params to have it happen quicker? (See the config sketch after these notes.)
Dade: The problem is not necessarily functional bugs, but rather performance regressions. These are usually only seen over a longer period of time. We should invest more in metrics monitoring. There should be some balance of resources between setting up better metrics and building testing infrastructure.
Phil: True. We should ensure the work we put into a testing infrastructure yields good enough data and results to be worth the effort. If we do set up a testnet in a controlled environment, it should probably focus more on functional testing, for example ensuring our Beacon APIs are not broken, and push more of the testing infrastructure towards testing against a more real-world environment like Prater. It should be focused on relieving some of the grunt work (automation).
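As a rough illustration of the chain-parameter idea above (compressing wall-clock time on a controlled devnet), the sketch below shows the kind of overrides that could be applied. The parameter names follow the consensus-spec config; the specific values, and whether Lodestar accepts each of them as a devnet override, are assumptions to verify rather than a recommendation.

```ts
// Sketch only: chain-config overrides for a throwaway local devnet, aimed at
// compressing wall-clock time so multi-epoch behaviour shows up sooner.
// Parameter names follow the consensus-spec config; values are illustrative.
const devnetChainConfigOverrides = {
  // Halve slot time relative to mainnet's 12s so epochs complete twice as fast.
  SECONDS_PER_SLOT: 6,
  // Start the chain shortly after the genesis state is built.
  GENESIS_DELAY: 30,
  // Keep the validator set small enough for a handful of machines to run.
  MIN_GENESIS_ACTIVE_VALIDATOR_COUNT: 64,
};

export default devnetChainConfigOverrides;
```

Even with compressed slots, Dade's point stands: regressions that only surface after days of continuous uptime are not necessarily reproduced any faster, which is why better metrics monitoring is the other half of the investment.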
Closing for #4724
After the performance regression in v0.34.0, it is clear that we need a further, more comprehensive testing process for future releases, one that covers our various environments, setups, and specific parameters, and reliably catches critical regressions in node performance. The end goal is to eventually automate this process in our testing environments and automatically detect regressions in the metrics we define for passing beta release testing.
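As a sketch of what that automation could eventually look like, the snippet below compares the average of one metric between the beta node and the previous stable node over the same window, assuming both are scraped by a Prometheus instance. The Prometheus URL, instance labels, choice of metric, and tolerance are all placeholders, not the real pass/fail criteria.

```ts
// Sketch: flag a regression if the beta node is worse than the previous stable
// node by more than an allowed tolerance, using Prometheus as the data source.
// URL, instance labels, metric choice and tolerance below are placeholders.

const PROMETHEUS = "http://localhost:9090";

async function avgOverWindow(expr: string, hours: number): Promise<number> {
  // avg_over_time over a PromQL subquery yields one averaged sample per series.
  const query = `avg_over_time((${expr})[${hours}h:1m])`;
  const res = await fetch(`${PROMETHEUS}/api/v1/query?query=${encodeURIComponent(query)}`);
  const body = (await res.json()) as {data: {result: {value: [number, string]}[]}};
  if (body.data.result.length === 0) throw new Error(`No data for: ${expr}`);
  return parseFloat(body.data.result[0].value[1]);
}

async function main(): Promise<void> {
  const windowHours = 72; // "no fewer than three days"
  const tolerance = 0.1; // allow up to 10% regression (placeholder)

  const cpuBeta = await avgOverWindow(
    `rate(process_cpu_seconds_total{instance="beta-node:8008"}[5m])`,
    windowHours
  );
  const cpuStable = await avgOverWindow(
    `rate(process_cpu_seconds_total{instance="stable-node:8008"}[5m])`,
    windowHours
  );

  const regression = (cpuBeta - cpuStable) / cpuStable;
  console.log(`CPU usage vs stable: ${(regression * 100).toFixed(1)}% change`);
  if (regression > tolerance) {
    console.error("FAIL: beta regressed beyond tolerance");
    process.exit(1);
  }
  console.log("PASS");
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
```

A script along these lines could run against the long-lived beta and stable instances and report the result on the release checklist, which is the kind of grunt work we want automated.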
Our minimal testing environments include a combination of the following:
Hardware Resource Diversity Requirements:
Validator Set Requirements:
Environment Requirements:
A combination of each of these environments should be tested for no fewer than three days and compared with the previous stable version to analyze regressions in any of the metrics. The following metrics checklist defines the criteria required to pass our beta release testing:
This list is a draft and a work in progress.
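To make the checklist automatable once it firms up, each entry could be expressed as data rather than prose, so a comparison script (like the sketch earlier in this issue) can iterate over it. The metric names, directions, and tolerances below are entirely hypothetical placeholders:

```ts
// Sketch: checklist entries encoded for automated evaluation against the
// previous stable release. All names and numbers are placeholders.
export interface MetricCriterion {
  /** PromQL expression (or metric name) to evaluate on both nodes */
  expr: string;
  /** Which direction counts as better for this metric */
  betterWhen: "lower" | "higher";
  /** Allowed regression versus the previous stable version, in percent */
  maxRegressionPct: number;
}

export const betaReleaseChecklist: MetricCriterion[] = [
  {expr: `rate(process_cpu_seconds_total[5m])`, betterWhen: "lower", maxRegressionPct: 10},
  {expr: `nodejs_heap_size_used_bytes`, betterWhen: "lower", maxRegressionPct: 15},
  {expr: `connected_peer_count_placeholder`, betterWhen: "higher", maxRegressionPct: 5},
];
```

Whether each entry ends up as a hard gate or just a flag for manual review is part of what the draft still needs to settle.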