serathius closed this 6 months ago
cc @anishathalye
I created a draft integration into the etcd robustness tests so we can see how it looks:
The etcd robustness tests patch the operation history before passing it to porcupine.
This is done to improve the reliability of the test by reducing the chance of linearization hitting exponential complexity. We remove operations that don't impact linearization, such as failed read requests and failed writes that we know were not persisted.
Adding those operations back in the visualization would allow us to fill the gaps and complete the picture, which makes the visualization easier to understand.
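To make the patching step above concrete, here is a minimal sketch of splitting a history into the operations that are checked and the ones that are only displayed. The `Op` type, its fields, and `splitHistory` are hypothetical names made up for illustration; they are not porcupine's or etcd's actual types, and the real robustness tests may patch the history differently.

```go
package main

import "fmt"

// Op is a hypothetical, simplified record of one client request.
// It is NOT porcupine's Operation type.
type Op struct {
	ClientID          int
	Call              int64 // time the request was sent
	Return            int64 // time the response or error was observed
	IsRead            bool
	Failed            bool
	KnownNotPersisted bool // only meaningful for failed writes
}

// splitHistory separates operations that must be linearized from operations
// that cannot affect the result and are kept only for visualization.
func splitHistory(history []Op) (check, displayOnly []Op) {
	for _, op := range history {
		switch {
		case op.Failed && op.IsRead:
			// A failed read observed nothing, so it cannot constrain the order.
			displayOnly = append(displayOnly, op)
		case op.Failed && !op.IsRead && op.KnownNotPersisted:
			// A failed write that was certainly not persisted had no effect.
			displayOnly = append(displayOnly, op)
		default:
			// Successful requests, and failed writes that may have been
			// persisted, still have to go through the checker.
			check = append(check, op)
		}
	}
	return check, displayOnly
}

func main() {
	history := []Op{
		{ClientID: 0, Call: 1, Return: 2},
		{ClientID: 1, Call: 3, Return: 9, IsRead: true, Failed: true},
		{ClientID: 2, Call: 4, Return: 8, Failed: true, KnownNotPersisted: true},
		{ClientID: 3, Call: 5, Return: 7, Failed: true},
	}
	check, displayOnly := splitHistory(history)
	fmt.Println(len(check), len(displayOnly)) // prints "2 2"
}
```

The `displayOnly` slice is what this issue asks to feed back into the visualization, since those operations keep their original call and return times.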
ping @anishathalye
I will take a look this weekend. Quick feedback for now: I would like the API to support adding "extra information" more broadly, not just operations (which go through the code that converts operations to strings).
Sounds good. I was thinking about adding an option to add groups of operations with custom colors, a legend, and the ability to hide a group.
In the etcd robustness tests we have at least 3 types of operations: normal linearizable requests; failed linearizable reads, which we don't want to include in linearization because they only reduce performance without affecting the result; and serializable operations, which are requests about historical state where we check that the state is available in the history window but don't validate the response contents. It would be nice to show each type in a different color so they are easy to distinguish.
For failed reads it would also be nice to be able to hide them. When testing the PR I found that there are long periods where all requests fail. This is because in the etcd robustness tests we inject failpoints into etcd that can take the process down. During this time the API is not responsive, and we only send read requests (again for performance, to avoid linearizing too many concurrent failing writes) until the API recovers. It would be nice to just have an option to hide failed requests.
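As a rough illustration of the three groups described above, a classifier could look like the sketch below. `Request`, `Group`, `classify`, and the color values are all hypothetical and chosen only for this example; nothing here is an existing porcupine or etcd API.

```go
package main

import "fmt"

// Group is a hypothetical label for a visualization group.
type Group string

const (
	GroupLinearizable Group = "linearizable"
	GroupFailedRead   Group = "failed-read"
	GroupSerializable Group = "serializable"
)

// groupColor maps each group to an arbitrary example color for a legend.
var groupColor = map[Group]string{
	GroupLinearizable: "#4caf50",
	GroupFailedRead:   "#9e9e9e",
	GroupSerializable: "#2196f3",
}

// Request is a hypothetical, simplified view of one client request.
type Request struct {
	IsRead       bool
	Failed       bool
	Serializable bool
}

// classify picks the visualization group for a request.
func classify(r Request) Group {
	switch {
	case r.Serializable:
		// Historical reads: availability is checked, contents are not.
		return GroupSerializable
	case r.IsRead && r.Failed:
		// Failed linearizable reads are left out of the linearization.
		return GroupFailedRead
	default:
		return GroupLinearizable
	}
}

func main() {
	requests := []Request{
		{IsRead: true},
		{IsRead: true, Failed: true},
		{IsRead: true, Serializable: true},
	}
	for _, r := range requests {
		g := classify(r)
		fmt.Println(g, groupColor[g])
	}
}
```

Keeping the group as a plain label on each operation would also make it straightforward to hide a whole group, such as failed reads, in the UI.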
One additional feature I have in mind is showing non-operational information. For etcd robustness we run the test in 3 phases:
The most interesting part is usually the failpoint injection and the beginning of recovery. In a 3-node cluster, failpoint injection doesn't necessarily take the cluster down, so it's interesting to see whether a node going down causes invalid behavior. And when the node recovers, it's interesting to see whether it correctly rejoins the cluster. It would be nice to have those phases highlighted in the visualization.
I think https://github.com/anishathalye/porcupine/pull/18 addresses all of those needs, except the ability to hide annotations in the UI. Would appreciate any feedback you have on that PR!
Fix https://github.com/anishathalye/porcupine/issues/14
I don't have enough experience with the library to properly design an interface, so I'm open to suggestions.
I added only a method for adding Operations, not Events. The reason is that events have no notion of time; they infer time from their order. For this use case, however, we want to fill the gaps in the linearization visualization without changing it directly, so we need to use time as the reference point.
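The difference between the two representations can be sketched as follows. The `Event` and `Interval` types and `eventsToIntervals` are hypothetical simplifications written for this comment, not porcupine's own definitions. The point is that an event history only yields positions in the list, so an extra operation cannot be placed "in the gaps" without inserting events and shifting every later position, while a timed operation can be placed directly by its call and return times.

```go
package main

import "fmt"

// Event is a hypothetical, simplified call/return event. Its only ordering
// information is its position in the history slice.
type Event struct {
	ID       int  // matches a call with its return
	IsReturn bool // false for a call event, true for a return event
}

// Interval is a [Call, Return] span on some time axis.
type Interval struct {
	Call, Return int64
}

// eventsToIntervals derives pseudo-times from list positions: the "time" of
// each event is simply its index in the history.
func eventsToIntervals(events []Event) map[int]Interval {
	intervals := make(map[int]Interval)
	for i, e := range events {
		iv := intervals[e.ID]
		if e.IsReturn {
			iv.Return = int64(i)
		} else {
			iv.Call = int64(i)
		}
		intervals[e.ID] = iv
	}
	return intervals
}

func main() {
	events := []Event{
		{ID: 0}, // call 0
		{ID: 1}, // call 1
		{ID: 0, IsReturn: true},
		{ID: 1, IsReturn: true},
	}
	intervals := eventsToIntervals(events)
	fmt.Println(intervals[0], intervals[1]) // prints "{0 2} {1 3}"
}
```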