cekit / behave-test-steps

MIT License
0 stars 21 forks source link

"When container is started with args" waits for a java process even when not calling java #48

Open jmtd opened 8 months ago

jmtd commented 8 months ago

We use the step "When container is started with args" in quite a few tests, e.g. all of these ones which run a variety of binaries: ls,rpm,bash, but notably not java.

That step is defined here https://github.com/cekit/behave-test-steps/blob/v1/steps/container_steps.py#L86 Note that function has a parameter pname defaulting to java, but nothing alters that value: it's not parameterized in the @given or @when variants that select it.

start_container_with_args calls wait_for_process(context, pname) (https://github.com/cekit/behave-test-steps/blob/v1/steps/container_steps.py#L167 , always with pname=java) which does a busy-loop poll of the process table (via invoking ps) with a hard-coded 10 second timeout.

Thus, any test using the step "When container is started with args" will have an up-to 10 second delay added to its runtime looking for a java process before completing.

(I've discovered this while trying to figure out why our GHA tests runs all time out and fail. The last thing they log as running is one of the wait_for_process polling loops, although that might be a coincidence. The tests stick in that state for 5-6h before GHA kills them.)

rnc commented 8 months ago

I see what you mean. https://github.com/cekit/behave-test-steps/blob/v1/steps/container_steps.py#L109 seems to be similar. However, https://github.com/cekit/behave-test-steps/blob/v1/steps/container_steps.py#L99 and https://github.com/cekit/behave-test-steps/blob/v1/steps/container_steps.py#L157 appears to pass through entrypoint/pname?

I guess there could be backwards compatibility issues if you changed some of the functions? Perhaps raise this on the CEKit channel to see if people would be willing to change?

jmtd commented 8 months ago

However, https://github.com/cekit/behave-test-steps/blob/v1/steps/container_steps.py#L99 and https://github.com/cekit/behave-test-steps/blob/v1/steps/container_steps.py#L157 appears to pass through entrypoint/pname?

Yes they do.

I guess there could be backwards compatibility issues if you changed some of the functions? Perhaps raise this on the CEKit channel to see if people would be willing to change?

Absolutely, yes. This points to a wider problem with behave-test-steps versioning, really: everyone is using the same branch, v1, which is tied to the schema version in cekit (which seems to be 1 for all production releases. I think we toyed with version 2 in the past in development branches).

Since everyone is using the same branch, and (until now) there's been no easy way to vary that, people have been very conservative when adding steps. Whereas, with a coordinated change, it might have sometimes made sense to adjust or reword an existing step, we've always just added a new one, so as to not potentially break anyone else.

Combined with there being no official docs on what steps there are, besides reading the source code, we've just accreted more and more steps over time.

I think it would be great to have a discussion about how this might be managed differently. I think it's also worth revisiting if the steps remain external to cekit itself: if they were internal, that'd be one less git clone per test run across all cekit users by default, and we could manage changes to the steps provided via cekit's existing versioning/release scheme.

But even if we didn't got that far: I'd like to whiteboard/brainstorm ideas around publishing some kind of "api doc" for steps, some kind of recommendations for which steps to use, a method for deprecating others, and similar.