Health check mock timeouts

ScottFreeCode commented 2 years ago

Depending on the developer's machine (rather than the project), mocks can sometimes take longer to finish starting up than their health check allows. It would be helpful to check for a user-level or system-level configuration that can override the health check timing for the mocks, either for all mocks or for specific ones.

shanejansen commented 2 years ago

This is an interesting problem. I wonder if the health check could be reworked altogether to avoid this issue. Instead of depending on time, it could wait for some "healthy" keyword in the mock's startup output.
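A minimal sketch of the keyword idea, not Touchstone's actual health check. It assumes the mock's startup output is available as an iterable of lines; `wait_for_keyword`, the `"healthy"` default, and the 60-second cap are all hypothetical names and values chosen for illustration.

```python
import time
from typing import Iterable


def wait_for_keyword(
    lines: Iterable[str],
    keyword: str = "healthy",
    timeout_s: float = 60.0,
) -> bool:
    """Return True as soon as a line of startup output contains the keyword.

    `lines` is assumed to be an iterable over the mock's output (for example
    a log stream). The timeout is only a safety cap and is checked between
    lines, so a read that blocks forever would still need its own timeout.
    """
    deadline = time.monotonic() + timeout_s
    for line in lines:
        if keyword in line:
            return True
        if time.monotonic() > deadline:
            return False
    # The output ended without ever reporting the keyword.
    return False
```

With this approach a slow machine simply takes longer to print the keyword, so the check adapts to the machine instead of relying on a fixed startup time.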

The default health-check timeout could probably just be increased a considerable amount in the meantime...

ScottFreeCode commented 2 years ago

A similar problem can apply to the services if a developer's machine is significantly slower than some teammates' machines or CI. They could check into the repo a number of retries that is enough for the slowest machine, or they can manually tweak it without checking it in.

But I wonder if some kind of mechanism could be introduced whereby a slower machine has a global configuration that increases the time Touchstone allows for startup of mocks or services, either on top of or proportional to the configured time. Then we solve both the built-in mocks problem and the services problem without having to tweak the project's normal timing for the slowest developer machine.

(Ideally, do this by increasing the number of retries rather than the time between retries? I recommend that to team members so that, once the container becomes healthy, they don't wait extra seconds for the next retry to detect it.)
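A rough sketch of what a machine-level factor applied to the retry count might look like. Nothing here is existing Touchstone API: `scaled_retries`, `wait_until_healthy`, and `machine_factor` are hypothetical, and where the factor would come from (a user-level config file, an environment variable) is left open.

```python
import math
import time
from typing import Callable


def scaled_retries(project_retries: int, machine_factor: float = 1.0) -> int:
    """Scale the project's retry count by a per-machine factor.

    The factor never lowers the count below what the project configured.
    """
    return max(project_retries, math.ceil(project_retries * machine_factor))


def wait_until_healthy(
    is_healthy: Callable[[], bool],
    project_retries: int,
    machine_factor: float = 1.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `is_healthy` at a fixed interval, with a machine-scaled retry count."""
    for _ in range(scaled_retries(project_retries, machine_factor)):
        if is_healthy():
            return True
        sleep(interval_s)
    return False
```

Because only the number of attempts grows, the interval between checks stays short, so a container that becomes healthy is still detected within one interval.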

ScottFreeCode commented 2 years ago

Note that something of this nature should not be necessary for the processing period or other timing controlled by tests. Tests could read machine settings from environment variables, choose the formula by which the time is derived (e.g. adding a constant baseline to a multiplied time rather than using only a multiplier), and so on. Letting the tests code their own timing is more flexible and, I think, already supported.
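For illustration, a test helper along these lines could derive its wait time from the machine's environment. The variable names `TEST_TIME_MULTIPLIER` and `TEST_TIME_BASELINE_S` and the function `processing_wait_s` are made up for this sketch and are not something Touchstone defines.

```python
import os
from typing import Mapping


def processing_wait_s(base_s: float, env: Mapping[str, str] = os.environ) -> float:
    """Derive a test wait time as baseline + multiplier * base.

    Both settings are read from (hypothetical) environment variables, so a
    slow machine can raise them without changing anything in the repo.
    """
    multiplier = float(env.get("TEST_TIME_MULTIPLIER", "1.0"))
    baseline_s = float(env.get("TEST_TIME_BASELINE_S", "0.0"))
    return baseline_s + multiplier * base_s
```

A test would then wait for `processing_wait_s(2.0)` seconds instead of a hard-coded 2 seconds, and each project stays free to pick a different formula.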

This is only about container startup, where we currently can't override the mock timeout or set a machine-dependent factor on the service timeout.

shanejansen / touchstone

Health check mock timeouts #39