Closed chenmoneygithub closed 1 year ago
Recently we have been seeing a few timeouts in accelerator testing, but the log shows that the tests themselves finish:
Step #5 - "create-job": keras_nlp/utils/tf_utils_test.py::TensorToStringListTest::test_session <- ../usr/local/lib/python3.8/dist-packages/tensorflow/python/framework/test_util.py SKIPPED (Not a test.) [100%]
Step #5 - "create-job":
Step #5 - "create-job": ================ 1035 passed, 214 skipped in 2780.92s (0:46:20) ================
Step #5 - "create-job": + sleep 5
Step #5 - "create-job": + gcloud artifacts docker images delete us-west1-docker.pkg.dev/keras-team-test/keras-nlp-test/keras-nlp-image:9480287a-708d-401e-8924-7ab6b420ddb4
Step #5 - "create-job": Digests:
Step #5 - "create-job": - us-west1-docker.pkg.dev/keras-team-test/keras-nlp-test/keras-nlp-image@sha256:8f69c0a0cb78aca13c62ab87ca0e252bedecf4d46bb2ce122c4080a5342dcba9
Step #5 - "create-job":
Step #5 - "create-job": Tags:
Step #5 - "create-job": - us-west1-docker.pkg.dev/keras-team-test/keras-nlp-test/keras-nlp-image:9480287a-708d-401e-8924-7ab6b420ddb4
Step #5 - "create-job":
Step #5 - "create-job": This operation will delete the above resources.
Step #5 - "create-job":
Step #5 - "create-job": Do you want to continue (Y/n)?
Step #5 - "create-job": Delete request issued.
Step #5 - "create-job": Waiting for operation [projects/keras-team-test/locations/us-west1/operations/6fcd169d-1748-4a89-9ac5-e27091ba41c5] to complete...
TIMEOUT ERROR: context deadline exceeded
I will increase the deadline a bit to see if it helps, but let's keep this issue open for tracking.
Increasing the deadline seems to have solved the issue for now, but let's think about splitting the workload into a nightly test.