Open jochenehret opened 3 months ago
PR was merged 5 days ago. So far no failures. Let's observe a few more days before we close this issue.
Failing again, multiple times in a row: https://concourse.wg-ard.ci.cloudfoundry.org/teams/main/pipelines/cf-deployment/jobs/fips-cats/builds/162 https://concourse.wg-ard.ci.cloudfoundry.org/teams/main/pipelines/cf-deployment/jobs/fips-cats/builds/163 https://concourse.wg-ard.ci.cloudfoundry.org/teams/main/pipelines/cf-deployment/jobs/fips-cats/builds/164
Looks like our connection handling is not working correctly...
The TCP Routing test that checks if one app can be reached from two ports is failing often here: https://github.com/cloudfoundry/cf-acceptance-tests/blob/6f060209f7a55f0c4f8d0fffabb122c785ce914e/tcp_routing/tcp_routing.go#L131
Example failures: https://concourse.wg-ard.ci.cloudfoundry.org/teams/main/pipelines/cf-deployment/jobs/fips-cats/builds/82 https://concourse.wg-ard.ci.cloudfoundry.org/teams/main/pipelines/cf-deployment/jobs/fips-cats/builds/57 https://concourse.wg-ard.ci.cloudfoundry.org/teams/main/pipelines/cf-deployment/jobs/fips-cats/builds/113 https://concourse.wg-ard.ci.cloudfoundry.org/teams/main/pipelines/cf-deployment/jobs/fips-cats/builds/120
I've recreated the test setup manually on fips/snape. The setup works as expected: you can send data over two different TCP ports to the test app, and the app responds correctly on both. Running the same test as part of the CATs suite, however, fails often.
I've added some debug statements with timestamps. Here's the flow from a failed run:
When the second message is sent, the `conn.Write(message)` statement returns no error: https://github.com/cloudfoundry/cf-acceptance-tests/blob/6f060209f7a55f0c4f8d0fffabb122c785ce914e/cats_suite_helpers/cats_suite_helpers.go#L417 However, the test app doesn't seem to receive the message. There is no "Message to" log statement. What happens next is an error at the `conn.Read(buff)` statement: https://github.com/cloudfoundry/cf-acceptance-tests/blob/6f060209f7a55f0c4f8d0fffabb122c785ce914e/cats_suite_helpers/cats_suite_helpers.go#L429 The error is "EOF" and the buffer is empty.

Looks like a race condition. The `Read` function is probably called before the test app starts to write and fails immediately with EOF?