nightwatchjs / nightwatch

Integrated end-to-end testing framework written in Node.js and using W3C Webdriver API. Developed at @browserstack
https://nightwatchjs.org
MIT License
11.8k stars 1.31k forks source link

502 Bad Gateway response should be retried #2458

Closed Ftelf closed 4 years ago

Ftelf commented 4 years ago

Using selenium cloud services like BrowserStack etc.. renders in frequent 502 Bad Gateway responses when their service is not quite reponsible. We have like 20 false fails per week just due to this problem. This error is consumed in the NWJS logger and visible only when --verbose is used.

It looks like follows in the verbose log: &1|10:22:46.047 tests/BVT/PG-1232.2 INFO Request: POST http://hub-euw.browserstack.com/wd/hub/session/90f39c9ed679617971d2bbc01701e10588e68c89/element/303/click &1|10:22:46.047 - data: &1|10:22:46.047 - headers: {"Content-Length":0} &1|10:22:46.047 tests/BVT/PG-1232.2 Error processing the server response: &1|10:22:46.047 <html> &1|10:22:46.047 <head><title>502 Bad Gateway</title></head> &1|10:22:46.047 <body bgcolor="white"> &1|10:22:46.047 <center><h1>502 Bad Gateway</h1></center> &1|10:22:46.047 <hr><center>nginx</center> &1|10:22:46.047 </body> &1|10:22:46.047 </html> &1|10:22:46.047 tests/BVT/PG-1232.2 ERROR Response 502 POST http://hub-euw.browserstack.com/wd/hub/session/90f39c9ed679617971d2bbc01701e10588e68c89/element/303/click (286ms) &1|10:22:46.047 tests/BVT/PG-1232.2 { value: -1, error: 'Unexpected token < in JSON at position 0' } &1|10:22:46.047 tests/BVT/PG-1232.2 LOG → Completed command click &1|10:22:46.047 tests/BVT/PG-1232.2 (457 ms) &1|10:22:46.047 tests/BVT/PG-1232.2 LOG → Completed command useCss &1|10:22:46.047 tests/BVT/PG-1232.2 (0 ms) &1|10:22:46.047 tests/BVT/PG-1232.2 LOG → Completed command useRecursion &1|10:22:46.047 tests/BVT/PG-1232.2 (0 ms)

In this particular example the click selenium command fails on nginx. NWJS wrongly parsed the response and continue with execution. The click should have open drop-down menu which was not opened and the test fails after timeout not founding the list.

NWJS should recognize 502 Bad Gateway the same as timeout etc... and use this retry option: request_timeout_options: { retry_attempts: 3 }, also for such responses.

beatfactor commented 4 years ago

Have you contacted their support as well? This isn't a timeout error on their side, it looks more like the service is unavailable, so I'm not sure if a retry will help.

Ftelf commented 4 years ago

Hi, thanks for reply. I have contacted their support but the fact is that active request refusal from load-balancer is quite common these days. Even our cloud app which we develop does use this. In one e2e tests verification there are 127 707 selenium command calls and in about 5 run there are one or more 502 Bad Gateway responses and we have found that only when --verbose is in use. The verbose log has 100MB so we want to turn that off asap again. If retry is not an option we would even not mind of NWJS would just throw an ERROR to the console but in normal mode, not only in --verbose. Without verbose the response just fails to be parsed by JSON and NWJS silently w/o any notification continue to execute other commands. It was extremely flustrating and time-consuming to find the true source of the random failures.

beatfactor commented 4 years ago

Yes, definitely the 502 responses should be treated as errors, with or without the retry in place.

beatfactor commented 4 years ago

Also, you do have the error logged, without verbose, (see "Error processing the server response: ...") but it's probably not reported as an error. Can you share a sample test which produces this error?

Ftelf commented 4 years ago

The result of the test/suite depends on which kind of command is under processing. if it is waitForElementNotPresent the problem is ignored since NWJS treats the bad response as a proof that element is not present which is definitely not desirable effect. The test most probable fails few commands later since something is not done before. But the debugging is not trivial. This is how it could look: ` tests/BVT/PG-1249.2 ✔ Expected element Section[name=topBar],Section[name=secondaryActions],Element[name=@save] to not be enabled - condition was met in 1183ms tests/BVT/PG-1249.2 ✔ Element Section[name=topBar],Section[name=primaryActions],Element[name=@build] was visible after 1407 milliseconds. tests/BVT/PG-1249.2 ✔ Expected element Section[name=topBar],Section[name=primaryActions],Element[name=@build] to be enabled - condition was met in 1455ms tests/BVT/PG-1249.2 Error processing the server response:

502 Bad Gateway

502 Bad Gateway


nginx

tests/BVT/PG-1249.2 ERROR: Unable to locate element: "Section[name=topBar],Section[name=primaryActions],Element[name=@build]" using: recursion tests/BVT/PG-1249.2 at Page.waitForAndClick [as buildModel] (/godata/pipelines/DEV-deploy-price-optimization/vendavo-frontend/vendavo-price-optimization/e2e/page/model.js:398:20) at Object.buildModel [as Add new draft model] (/godata/pipelines/DEV-deploy-price-optimization/vendavo-frontend/vendavo-price-optimization/e2e/tests/BVT/PG-1249.2.js:56:8) tests/BVT/PG-1249.2 ✖ Waiting for dialog to open... - expected "visible" but got: "not found" tests/BVT/PG-1249.2 at Page.waitForVisibleWithHeading [as buildModel] (/godata/pipelines/DEV-deploy-price-optimization/vendavo-frontend/vendavo-price-optimization/e2e/page/model.js:401:8) at Object.buildModel [as Add new draft model] (/godata/pipelines/DEV-deploy-price-optimization/vendavo-frontend/vendavo-price-optimization/e2e/tests/BVT/PG-1249.2.js:56:8) tests/BVT/PG-1249.2 FAILED: 1 assertions failed, 1 errors and 24 passed (5m 42s / 342170ms) `

Sorry, I can't find any log with that issue when it just passed OK and with verbose not present. It would requere me to go through thousands of logs on gocd. Maybe later, but there is one with verbose. The same as in the first post but expanded. The result is that click request fails, there is no error thrown, just that <><>502 Bad Gateway<><> in the middle of the log and the test just continues trying to find the element which should appear after the click and then fails on timeout.

&1|10:25:09.200 tests/BVT/PG-1235.3 INFO Request: POST http://hub-euw.browserstack.com/wd/hub/session/e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1/element/169/element &1|10:25:09.200 tests/BVT/PG-1235.3 - data: {"using":"xpath","value":".//*[contains(@class, \"Field_container\") and ./*[contains(@class, \"Field_label\")]/label[.=\"Field\"]]/*[contains(@class, \"Field_controls\")]//*[contains(concat(' ', normalize-space(@class), ' '), ' select2-container ') and not(contains(concat(' ', normalize-space(@class), ' '), ' select2-container-disabled '))]//*[contains(@class, \"select2-chosen\")]"} &1|10:25:09.200 - headers: {"Content-Type":"application/json; charset=utf-8","Content-Length":386} &1|10:25:09.200 tests/BVT/PG-1235.3 Error processing the server response: &1|10:25:09.200 <html> &1|10:25:09.200 <head><title>502 Bad Gateway</title></head> &1|10:25:09.200 <body bgcolor="white"> &1|10:25:09.200 <center><h1>502 Bad Gateway</h1></center> &1|10:25:09.200 <hr><center>nginx</center> &1|10:25:09.200 </body> &1|10:25:09.200 </html> &1|10:25:09.200 tests/BVT/PG-1235.3 ERROR Response 502 POST http://hub-euw.browserstack.com/wd/hub/session/e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1/element/169/element (194ms) &1|10:25:09.200 tests/BVT/PG-1235.3 { value: -1, error: 'Unexpected token < in JSON at position 0' } &1|10:25:09.200 tests/BVT/PG-1235.3 ERROR: Unable to locate element: "Section[name=filters],Section[name=transactionFilters],Element[name=@__.//*[contains(@class, "Field_container") and ./*[contains(@class, "Field_label")]/label[.="Field"]]/*[contains(@class, "Field_controls")]//*[contains(concat(' ', normalize-space(@class), ' '), ' select2-container ') and not(contains(concat(' ', normalize-space(@class), ' '), ' select2-container-disabled '))]//*[contains(@class, "select2-chosen")]]" using: recursion &1|10:25:09.200 tests/BVT/PG-1235.3 at Section.waitForAndClick [as setDropDown] (/godata/pipelines/DEV-deploy-price-optimization/vendavo-frontend/vendavo-price-optimization/e2e/page/defaultSettings2.js:640:20) &1|10:25:09.200 at Section.setDropDown [as setField] (/godata/pipelines/DEV-deploy-price-optimization/vendavo-frontend/vendavo-price-optimization/e2e/page/defaultSettings2.js:648:27) &1|10:25:09.200 at Object.setField [as Set Unit Margin guidance scenario parameters] (/godata/pipelines/DEV-deploy-price-optimization/vendavo-frontend/vendavo-price-optimization/e2e/tests/BVT/PG-1235.3.js:121:8) &1|10:25:09.200 tests/BVT/PG-1235.3 LOG → Completed command click &1|10:25:09.200 tests/BVT/PG-1235.3 (363 ms) &1|10:25:09.200 tests/BVT/PG-1235.3 LOG → Completed command useCss &1|10:25:09.200 tests/BVT/PG-1235.3 (0 ms) &1|10:25:09.200 tests/BVT/PG-1235.3 LOG → Completed command useRecursion (0 ms) &1|10:25:09.200 tests/BVT/PG-1235.3 INFO Request: POST http://hub-euw.browserstack.com/wd/hub/session/e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1/elements &1|10:25:09.200 tests/BVT/PG-1235.3 - data: {"using":"css selector","value":"#select2-drop"} &1|10:25:09.200 - headers: {"Content-Type":"application/json; charset=utf-8","Content-Length":48} &1|10:25:09.200 tests/BVT/PG-1235.3 INFO Response 200 POST http://hub-euw.browserstack.com/wd/hub/session/e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1/elements (88ms) { sessionId: 'e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1', &1|10:25:09.200 state: 'success', &1|10:25:09.200 hCode: 3506223, &1|10:25:09.200 value: [], &1|10:25:09.200 class: 'org.openqa.selenium.remote.Response', &1|10:25:09.200 status: 0 } &1|10:25:09.200 tests/BVT/PG-1235.3 INFO Request: POST http://hub-euw.browserstack.com/wd/hub/session/e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1/elements &1|10:25:09.200 tests/BVT/PG-1235.3 - data: {"using":"css selector","value":"#select2-drop"} &1|10:25:09.200 - headers: {"Content-Type":"application/json; charset=utf-8","Content-Length":48} &1|10:25:09.200 tests/BVT/PG-1235.3 INFO Response 200 POST http://hub-euw.browserstack.com/wd/hub/session/e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1/elements (179ms) &1|10:25:09.200 tests/BVT/PG-1235.3 { sessionId: 'e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1', &1|10:25:09.200 state: 'success', &1|10:25:09.200 hCode: 1500203, &1|10:25:09.200 value: [], &1|10:25:09.200 class: 'org.openqa.selenium.remote.Response', &1|10:25:09.200 status: 0 } &1|10:25:09.200 tests/BVT/PG-1235.3 INFO Request: POST http://hub-euw.browserstack.com/wd/hub/session/e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1/elements &1|10:25:09.200 tests/BVT/PG-1235.3 &1|10:25:09.200 tests/BVT/PG-1235.3 - data: {"using":"css selector","value":"#select2-drop"} &1|10:25:09.200 - headers: {"Content-Type":"application/json; charset=utf-8","Content-Length":48} &1|10:25:09.200 tests/BVT/PG-1235.3 INFO Response 200 POST http://hub-euw.browserstack.com/wd/hub/session/e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1/elements (175ms) { sessionId: 'e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1', &1|10:25:09.200 state: 'success', &1|10:25:09.200 hCode: 2541017, &1|10:25:09.200 value: [], &1|10:25:09.200 class: 'org.openqa.selenium.remote.Response', &1|10:25:09.200 status: 0 } &1|10:25:09.200 tests/BVT/PG-1235.3 INFO Request: POST http://hub-euw.browserstack.com/wd/hub/session/e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1/elements &1|10:25:09.200 - data: {"using":"css selector","value":"#select2-drop"} &1|10:25:09.200 - headers: {"Content-Type":"application/json; charset=utf-8","Content-Length":48} &1|10:25:09.200 tests/BVT/PG-1235.3 INFO Response 200 POST http://hub-euw.browserstack.com/wd/hub/session/e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1/elements (185ms) { sessionId: 'e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1', &1|10:25:09.200 state: 'success', &1|10:25:09.200 hCode: 17937850, &1|10:25:09.200 value: [], &1|10:25:09.200 class: 'org.openqa.selenium.remote.Response', &1|10:25:09.200 status: 0 } &1|10:25:09.200 tests/BVT/PG-1235.3 INFO Request: POST http://hub-euw.browserstack.com/wd/hub/session/e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1/elements &1|10:25:09.200 tests/BVT/PG-1235.3 - data: {"using":"css selector","value":"#select2-drop"} &1|10:25:09.200 - headers: {"Content-Type":"application/json; charset=utf-8","Content-Length":48} &1|10:25:09.200 tests/BVT/PG-1235.3 INFO Response 200 POST http://hub-euw.browserstack.com/wd/hub/session/e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1/elements (60ms) { sessionId: 'e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1', &1|10:25:09.200 state: 'success', &1|10:25:09.200 hCode: 25392974, &1|10:25:09.200 value: [], &1|10:25:09.200 class: 'org.openqa.selenium.remote.Response', &1|10:25:09.200 status: 0 } &1|10:25:09.200 tests/BVT/PG-1235.3 INFO Request: POST http://hub-euw.browserstack.com/wd/hub/session/e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1/elements &1|10:25:09.200 - data: {"using":"css selector","value":"#select2-drop"} &1|10:25:09.200 - headers: {"Content-Type":"application/json; charset=utf-8","Content-Length":48} &1|10:25:09.200 tests/BVT/PG-1235.3 INFO Response 200 POST http://hub-euw.browserstack.com/wd/hub/session/e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1/elements (116ms) &1|10:25:09.200 tests/BVT/PG-1235.3 { sessionId: 'e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1', &1|10:25:09.200 state: 'success', &1|10:25:09.200 hCode: 12321111, &1|10:25:09.200 value: [], &1|10:25:09.200 class: 'org.openqa.selenium.remote.Response', &1|10:25:09.200 status: 0 } &1|10:25:09.200 tests/BVT/PG-1235.3 INFO Request: POST http://hub-euw.browserstack.com/wd/hub/session/e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1/elements &1|10:25:09.200 - data: {"using":"css selector","value":"#select2-drop"} &1|10:25:09.200 - headers: {"Content-Type":"application/json; charset=utf-8","Content-Length":48} &1|10:25:09.200 tests/BVT/PG-1235.3 INFO Response 200 POST http://hub-euw.browserstack.com/wd/hub/session/e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1/elements (139ms) { sessionId: 'e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1', &1|10:25:09.200 state: 'success', &1|10:25:09.200 hCode: 1263710, &1|10:25:09.200 value: [], &1|10:25:09.200 class: 'org.openqa.selenium.remote.Response', &1|10:25:09.200 status: 0 } &1|10:25:09.200 tests/BVT/PG-1235.3 INFO Request: POST http://hub-euw.browserstack.com/wd/hub/session/e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1/elements &1|10:25:09.200 - data: {"using":"css selector","value":"#select2-drop"} &1|10:25:09.200 - headers: {"Content-Type":"application/json; charset=utf-8","Content-Length":48} &1|10:25:09.200 tests/BVT/PG-1235.3 INFO Response 200 POST http://hub-euw.browserstack.com/wd/hub/session/e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1/elements (359ms) { sessionId: 'e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1', &1|10:25:09.200 state: 'success', &1|10:25:09.200 hCode: 8718680, &1|10:25:09.200 value: [], &1|10:25:09.200 class: 'org.openqa.selenium.remote.Response', &1|10:25:09.200 status: 0 } &1|10:25:09.200 tests/BVT/PG-1235.3 INFO Request: POST http://hub-euw.browserstack.com/wd/hub/session/e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1/elements &1|10:25:09.200 - data: {"using":"css selector","value":"#select2-drop"} &1|10:25:09.200 - headers: {"Content-Type":"application/json; charset=utf-8","Content-Length":48} &1|10:25:09.200 tests/BVT/PG-1235.3 INFO Response 200 POST http://hub-euw.browserstack.com/wd/hub/session/e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1/elements (189ms) { sessionId: 'e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1', &1|10:25:09.200 state: 'success', &1|10:25:09.200 hCode: 27600103, &1|10:25:09.200 value: [], &1|10:25:09.200 class: 'org.openqa.selenium.remote.Response', &1|10:25:09.200 status: 0 } &1|10:25:09.200 tests/BVT/PG-1235.3 INFO Request: POST http://hub-euw.browserstack.com/wd/hub/session/e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1/elements &1|10:25:09.200 - data: {"using":"css selector","value":"#select2-drop"} &1|10:25:09.200 - headers: {"Content-Type":"application/json; charset=utf-8","Content-Length":48} &1|10:25:09.200 tests/BVT/PG-1235.3 INFO Response 200 POST http://hub-euw.browserstack.com/wd/hub/session/e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1/elements (99ms) { sessionId: 'e21ca6ebfdbc61a0cbe0b2505a20f5fc673086c1', &1|10:25:09.200 state: 'success', &1|10:25:09.200 hCode: 19584925, &1|10:25:09.200 value: [], &1|10:25:09.200 class: 'org.openqa.selenium.remote.Response', &1|10:25:09.200 status: 0 }

Ftelf commented 4 years ago

OK, I have found a clean job with 502 Bad Gateway inside: image image image

So something failed or was not processed, it could be hidden in very long log, and the job is green.

beatfactor commented 4 years ago

I've added a fix so that 5xx server errors encountered during element commands will be always treated as errors (so for instance in case of waitForElementNotPresent, the assertion will result in failure and will be retried according to the "wait for" settings).

beatfactor commented 4 years ago

Added to v1.3.8

Ftelf commented 4 years ago

That's great, any chance to have it backported to 0.9.21 ? We are locked to that version for a while since we are using some custom patches which allows to call any selenium API as sections-aware and also make dynamic selectors super-easy to use. I have tried to port it to newer version, but with no success.

beatfactor commented 4 years ago

Porting to 0.9.21 is very unlikely I'm afraid because there's considering effort involved and there's a large backlog of issues already, but we can consider adding the features which would make it possible to use the apis as you need. Feel free to open a new issue regarding that.