grpc / grpc-ios

gRPC for iOS

ios cronet test suite crash due to port server not running #44

Open dennycd opened 2 years ago

dennycd commented 2 years ago

We are observing frequent Cronet test timeouts in the grpc_basictests_objc_ios test suite, on both pre-submit and prod/master. The sample log below captures the issue: the Cronet test suite crashes because the Python port server is not running. XCTest then attempts to restart the test, which eventually leads to a timeout being reported.

Mon Nov  1 11:37:54 PDT 2021 - Test Case '-[CoreCronetEnd2EndTests testCancelAfterClientDone]' started.
Mon Nov  1 11:38:04 PDT 2021 - gRPC tests require a helper port server to allocate ports used
Mon Nov  1 11:38:04 PDT 2021 - during the test.
Mon Nov  1 11:38:04 PDT 2021 -
Mon Nov  1 11:38:04 PDT 2021 - This server is not currently running.
Mon Nov  1 11:38:04 PDT 2021 -
Mon Nov  1 11:38:04 PDT 2021 - To start it, run tools/run_tests/start_port_server.py
Mon Nov  1 11:38:04 PDT 2021 -
Mon Nov  1 11:38:25 PDT 2021 -
Mon Nov  1 11:38:25 PDT 2021 - Restarting after unexpected exit or crash in CoreCronetEnd2EndTests/testCancelAfterClientDone; summary will include totals from previous launches.
Mon Nov  1 11:38:25 PDT 2021 -
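
The message in the log says the tests need a helper port server that is not running. One way to surface this before the test run starts is a pre-flight check that the server accepts TCP connections. The following is a minimal sketch, not part of the gRPC tooling: the function name `port_server_is_up` is hypothetical, and the host and port must be taken from the actual test setup, since the log above does not state them.

```python
import socket
import time


def port_server_is_up(host, port, attempts=3, delay=0.5):
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper for illustration only. The caller supplies the
    host and port; they are assumptions, not values from the log.
    """
    for attempt in range(attempts):
        try:
            # create_connection raises OSError if nothing is listening.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Wait before retrying, but not after the final attempt.
            if attempt + 1 < attempts:
                time.sleep(delay)
    return False
```

If the check returns False, the job could fail fast, or run tools/run_tests/start_port_server.py as the log message suggests, instead of letting XCTest restart the test until the overall timeout is hit.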

Most sample test logs don't capture the above crash. Instead they only report a timeout, with the iOS Cronet tests being the last suite started:

2021-11-01 14:50:10,768 START: ios-test-cronettests

2021-11-01 15:16:45,843 TIMEOUT: run_tests_objc_macos_opt_native [pid=3790, time=5402.6sec]
2021-11-01 15:16:45,856 FAILED: Some run_tests.py instances have failed.

This is followed by a script step that looks for running port server processes (killing any it finds), after which the script exits with failure:

+ FAILED=true
+ ps aux
+ grep 'port_server\.py'
+ awk '{print $2}'
+ xargs kill -9
...
real    0m13.275s
user    0m0.359s
sys 0m8.198s
+ '[' true '!=' '' ']'
+ exit 1
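
To make the trace above easier to follow, here is a rough Python sketch of what the `ps aux | grep | awk` part of the pipeline selects. It is an illustration only: `matching_pids` is a hypothetical name, and the sketch only parses `ps`-style text rather than killing anything. Note that the pattern `port_server\.py` does not match the `grep` process's own command line, because that command line contains a backslash before the dot.

```python
import re


def matching_pids(ps_output, pattern=r"port_server\.py"):
    """Return the PIDs (second column) of ps lines matching pattern.

    Hypothetical helper mirroring `grep PATTERN | awk '{print $2}'`.
    """
    regex = re.compile(pattern)
    pids = []
    for line in ps_output.splitlines():
        if not regex.search(line):
            continue
        fields = line.split()
        # In `ps aux` output the PID is the second whitespace-separated field.
        if len(fields) > 1 and fields[1].isdigit():
            pids.append(int(fields[1]))
    return pids
```

An empty result here would mean no port server process was found at cleanup time. The `exit 1` at the end of the trace comes from the `FAILED=true` check, not from this pipeline.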

jtattermusch commented 2 years ago

It might be a red herring, but note that the objc_ios tests started to fail just after https://github.com/grpc/grpc/pull/27740 was merged - so there is a high chance that this PR has caused the problem.

dennycd commented 2 years ago

test reverting https://github.com/grpc/grpc/pull/27897

dennycd commented 2 years ago

The 27897 revert looks promising so far (https://fusion.corp.google.com/projectanalysis/summary/KOKORO/prod:grpc%2Fcore%2Fpull_request%2Fmacos%2Fgrpc_basictests_objc_ios), though I am still investigating the root cause.