Chapter 3 - I understand that saturation is important, but not how to measure it.

MalcolmAnderson commented 2 years ago

This may be something for the second edition, or I may be an idiot - the latter is often true.

tl;dr - After reading chapter 3, I don't feel confident in my ability to find and measure resource saturation.

Including the index, there are 23 instances of the word "saturation" and 1 instance of "saturated". It feels like a more important concept than 23 instances indicate.
Maybe I'm fixated on it because it's a new concept for me. From a lean standpoint, wait queues are a measure of waste, so I get the importance.

I believe I understand the concept; you phrased it very clearly at the beginning of 3.2: "Finally, saturation is the amount of work that the resource can’t service at any given moment (often queued)."

In 3.3, running "iostat -x", I get that "aqu-sz" is a measure of saturation: it's the average size of the queue of requests issued to the device.
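To make the disk case concrete, here is a minimal sketch that derives a utilization figure and an average queue size from two snapshots of `/proc/diskstats` on Linux. The function names (`parse_diskstats`, `disk_use`) are hypothetical, and the arithmetic is my assumption about how tools such as iostat arrive at `%util` and `aqu-sz`, not something taken from the book.

```python
def parse_diskstats(text):
    """Map device name -> (io_ticks_ms, weighted_io_ms) from /proc/diskstats text."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or truncated lines
        # fields[0:3] are major, minor, device name; the counters follow.
        # fields[12]: milliseconds spent doing I/O (device busy time)
        # fields[13]: weighted milliseconds spent doing I/O
        stats[fields[2]] = (int(fields[12]), int(fields[13]))
    return stats


def disk_use(before, after, interval_s, device):
    """Return (utilization_percent, average_queue_size) between two snapshots."""
    interval_ms = interval_s * 1000.0
    busy0, weighted0 = parse_diskstats(before)[device]
    busy1, weighted1 = parse_diskstats(after)[device]
    utilization_pct = 100.0 * (busy1 - busy0) / interval_ms
    # Assumption: the weighted counter grows by one millisecond per
    # outstanding request per millisecond, so its rate of change is the
    # average number of requests queued or in flight.
    avg_queue_size = (weighted1 - weighted0) / interval_ms
    return utilization_pct, avg_queue_size
```

On a real machine the two snapshots would be the contents of `/proc/diskstats` read `interval_s` seconds apart. A device can sit near 100% utilization with a queue size close to 1 (busy but keeping up) or with a queue size well above 1 (busy and falling behind), which is the distinction the chapter draws.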

In 3.3.4, this line seems to say that saturation is something inferred, rather than concretely measured: "The number of runnable processes gives an indication of the saturation of the system (the more processes competing for runtime, the busier the system—remember the load averages earlier in the chapter?)."
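A rough way to turn "number of runnable processes" into a number is to compare it with the CPU count. The sketch below parses the `procs_running` line of `/proc/stat` on Linux; the helper names are hypothetical, and treating "runnable tasks beyond the number of CPUs" as the saturation figure is an assumption for illustration, not a definition from the book.

```python
def parse_proc_stat(text):
    """Extract the procs_running and procs_blocked counters from /proc/stat text."""
    counts = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("procs_running", "procs_blocked"):
            counts[parts[0]] = int(parts[1])
    return counts


def cpu_saturation(text, cpu_count):
    """Runnable tasks beyond the number of CPUs; 0 means nothing is waiting.

    Assumption: procs_running counts tasks that are running or ready to run,
    so anything above cpu_count is waiting for a turn on a CPU.
    """
    running = parse_proc_stat(text)["procs_running"]
    return max(0, running - cpu_count)
```

On a live system the text would come from reading `/proc/stat` and the CPU count from `os.cpu_count()`. A single reading is only an instant in time, so sampling it repeatedly and watching the trend is probably more useful than any one value.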

What I didn't see during my time executing the code is an example of what high saturation looks like, as distinct from utilization.

I get that when we used up all the memory, we had high utilization, but I don't have a mental grip on what high (or rising) saturation would look like, or how to spot it.

I guess I do get it, but not how to find it and measure it (except for iostat -x).

As a self-check, here's my understanding of the difference between utilization and saturation, using a nightclub. Utilization is how full the club is: if the maximum occupancy is 100 people and there are 85 people in the club, it's at 85% capacity. Saturation is a measure of how big the line out the front is.

But I don't know how to measure that. Is it in average wait time? Is it number of people in the queue?

Or is that the point? Do I find a clue, and watch the motion of the numbers and try to figure out what the impact is?

More importantly, I don't know how to cause high saturation of a resource in an experiment.
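One possible experiment for CPU, sketched under assumptions: start more busy-looping processes than the machine has CPUs, so that some of them must wait for a turn. All names here (`burn`, `worker_count`, `saturate_cpu`) are hypothetical and this is not an experiment from the book.

```python
import multiprocessing
import os
import time


def burn(seconds):
    """Spin in a tight loop for roughly `seconds` to keep one CPU busy."""
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        pass


def worker_count(cpus, extra):
    """One worker per CPU fills the machine; `extra` more must wait their turn."""
    return cpus + extra


def saturate_cpu(extra=4, seconds=10.0):
    """Run more CPU-bound processes than there are CPUs; return how many ran."""
    cpus = os.cpu_count() or 1
    workers = [
        multiprocessing.Process(target=burn, args=(seconds,))
        for _ in range(worker_count(cpus, extra))
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return len(workers)
```

Calling `saturate_cpu()` from under an `if __name__ == "__main__":` guard while `vmstat 1` runs in another terminal should show CPU utilization pinned near 100% and the `r` (runnable) column sitting above the CPU count. With `extra=0` utilization should look much the same but without the excess runnable tasks, which is the contrast between the two metrics.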

Either way, I've been introduced to a new distinction, and I'm grappling with it, and that gives me new tools to play with, which makes me happy.

seeker89 commented 2 years ago

Hi @MalcolmAnderson, thanks for the issue!

Saturation is a little harder to visualise, but it's about the extra work that the device can't do.

I like your example of a night club, except that people tend to go and stay there for a little while. A better one might be a fast food joint next door to a night club, benefiting from people craving fried chicken at 2 am.

The utilization could be the fraction of time that the food joint was running at full speed, producing food. The saturation would be the length of the queue of people wanting to place an order.
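The fast food picture can be played out as a toy simulation. This is a minimal sketch under made-up assumptions (a single counter, a fixed number of orders served per tick, hypothetical name `simulate`), meant only to show utilization and queue length diverging.

```python
def simulate(arrivals, capacity):
    """Return (utilization, queue_lengths) for a single counter.

    arrivals[t] is the number of customers arriving in tick t;
    capacity is the number of orders the counter can serve per tick.
    """
    queue = 0
    served_total = 0
    queue_lengths = []
    for arriving in arrivals:
        queue += arriving
        served = min(queue, capacity)  # cannot serve more than capacity
        queue -= served
        served_total += served
        queue_lengths.append(queue)    # people still waiting after this tick
    utilization = served_total / (capacity * len(arrivals))
    return utilization, queue_lengths


print(simulate([1, 1, 1, 1], capacity=2))  # (0.5, [0, 0, 0, 0])
print(simulate([2, 2, 2, 2], capacity=2))  # (1.0, [0, 0, 0, 0])
print(simulate([3, 3, 3, 3], capacity=2))  # (1.0, [1, 2, 3, 4])
```

The last two runs report the same 100% utilization, but only the final one has a queue, and that queue keeps growing. In this toy model the queue length is counted in customers; dividing it by the capacity gives a rough wait in ticks, so either unit describes the same backlog.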

MalcolmAnderson commented 2 years ago

Seems like a fuzzy concept that is inherently difficult to measure. When talking with a VP, what units do you use to quantify the saturation?

seeker89 / chaos-engineering-book, issue #29