scylladb / scylla-cluster-tests

Tests for Scylla Clusters
GNU Affero General Public License v3.0

`ERR_ADDRESS_INVALID` when get screenshot in IPv6 test (all addresses, including test communication, are IPv6): #7492

Closed juliayakovlev closed 4 months ago

juliayakovlev commented 5 months ago

`ERR_ADDRESS_INVALID` is raised when taking a Grafana screenshot in an IPv6 test (all addresses, including test communication, are IPv6):

< t:2024-05-26 18:17:49,117 f:remotewebbrowser.py l:106  c:sdcm.utils.remotewebbrowser p:INFO  > Get url http://[2a05:d018:12e3:f002:0418:17b1:04dc:ef71]:3000/d/overview-master/overview?from=1716747164000&to=now&refresh=1d
< t:2024-05-26 18:17:49,817 f:logcollector.py l:522  c:sdcm.logcollector    p:ERROR > Error get screenshot overview: Message: unknown error: net::ERR_ADDRESS_INVALID
< t:2024-05-26 18:17:49,817 f:logcollector.py l:522  c:sdcm.logcollector    p:ERROR >   (Session info: chrome=91.0.4472.114)
< t:2024-05-26 18:17:49,817 f:logcollector.py l:522  c:sdcm.logcollector    p:ERROR > Error get screenshot overview: Message: unknown error: net::ERR_ADDRESS_INVALID
  (Session info: chrome=91.0.4472.114)

< t:2024-05-26 18:17:49,817 f:logcollector.py l:522  c:sdcm.logcollector    p:ERROR > Traceback (most recent call last):
< t:2024-05-26 18:17:49,817 f:logcollector.py l:522  c:sdcm.logcollector    p:ERROR >   File "/home/ubuntu/scylla-cluster-tests/sdcm/logcollector.py", line 515, in get_grafana_screenshot
< t:2024-05-26 18:17:49,817 f:logcollector.py l:522  c:sdcm.logcollector    p:ERROR >     self.remote_browser.open(grafana_url, dashboard.resolution)
< t:2024-05-26 18:17:49,817 f:logcollector.py l:522  c:sdcm.logcollector    p:ERROR >   File "/home/ubuntu/scylla-cluster-tests/sdcm/utils/remotewebbrowser.py", line 107, in open
< t:2024-05-26 18:17:49,817 f:logcollector.py l:522  c:sdcm.logcollector    p:ERROR >     self.browser.get(url)
< t:2024-05-26 18:17:49,817 f:logcollector.py l:522  c:sdcm.logcollector    p:ERROR >   File "/usr/local/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 333, in get
< t:2024-05-26 18:17:49,817 f:logcollector.py l:522  c:sdcm.logcollector    p:ERROR >     self.execute(Command.GET, {'url': url})
< t:2024-05-26 18:17:49,817 f:logcollector.py l:522  c:sdcm.logcollector    p:ERROR >   File "/usr/local/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
< t:2024-05-26 18:17:49,817 f:logcollector.py l:522  c:sdcm.logcollector    p:ERROR >     self.error_handler.check_response(response)
< t:2024-05-26 18:17:49,817 f:logcollector.py l:522  c:sdcm.logcollector    p:ERROR >   File "/usr/local/lib/python3.10/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
< t:2024-05-26 18:17:49,817 f:logcollector.py l:522  c:sdcm.logcollector    p:ERROR >     raise exception_class(message, screen, stacktrace)
< t:2024-05-26 18:17:49,817 f:logcollector.py l:522  c:sdcm.logcollector    p:ERROR > selenium.common.exceptions.WebDriverException: Message: unknown error: net::ERR_ADDRESS_INVALID
< t:2024-05-26 18:17:49,817 f:logcollector.py l:522  c:sdcm.logcollector    p:ERROR >   (Session info: chrome=91.0.4472.114)
< t:2024-05-26 18:17:49,817 f:cluster_aws.py  l:590  c:sdcm.cluster_aws     p:DEBUG > Node parallel-topology-schema-changes-mu-monitor-node-54e412f0-1 [3.255.226.192 | 10.4.10.159 | 2a05:d018:12e3:f002:0418:17b1:04dc:ef71] (dc name: eu-west-1): external_address is: 2a05:d018:12e3:f002:0418:17b1:04dc:ef71
< t:2024-05-26 18:17:49,825 f:cluster_aws.py  l:590  c:sdcm.cluster_aws     p:DEBUG > Node parallel-topology-schema-changes-mu-monitor-node-54e412f0-1 [3.255.226.192 | 10.4.10.159 | 2a05:d018:12e3:f002:0418:17b1:04dc:ef71] (dc name: eu-west-1): external_address is: 2a05:d018:12e3:f002:0418:17b1:04dc:ef71
< t:2024-05-26 18:17:49,825 f:remotewebbrowser.py l:104  c:sdcm.utils.remotewebbrowser p:INFO  > Set resolution 1920px*15000px
< t:2024-05-26 18:17:49,939 f:remotewebbrowser.py l:106  c:sdcm.utils.remotewebbrowser p:INFO  > Get url http://[2a05:d018:12e3:f002:0418:17b1:04dc:ef71]:3000/d/ae8d28c1-a270-452d-ade8-bb2e39fe57a8/7d18b9a0-7264-542a-8bef-0a6bd69764f5?from=1716747164000&to=now&refresh=1d
< t:2024-05-26 18:17:50,440 f:logcollector.py l:522  c:sdcm.logcollector    p:ERROR > Error get screenshot longevity-multidc-schema-topology-changes-ipv6-12h-scylla-per-server-metrics-nemesis: Message: unknown error: net::ERR_ADDRESS_INVALID
< t:2024-05-26 18:17:50,440 f:logcollector.py l:522  c:sdcm.logcollector    p:ERROR >   (Session info: chrome=91.0.4472.114)
< t:2024-05-26 18:17:50,440 f:logcollector.py l:522  c:sdcm.logcollector    p:ERROR > Error get screenshot longevity-multidc-schema-topology-changes-ipv6-12h-scylla-per-server-metrics-nemesis: Message: unknown error: net::ERR_ADDRESS_INVALID
  (Session info: chrome=91.0.4472.114)

This is an old problem. I see the same error in a run from 2023-03-02: https://argus.scylladb.com/test/d6286774-7a9d-4219-9124-316bd7674e46/runs?additionalRuns[]=13f8b9f3-c649-406b-a149-5f4af08b7198

Screenshots are missing in all those runs.
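
For reference, the URL in the log above already wraps the IPv6 literal in square brackets, which is the form URLs require, so the address formatting itself does not look like the culprit. A minimal sketch of that formatting using only the standard library follows; the helper name `build_grafana_url` and its defaults are hypothetical and are not taken from the SCT code base.

```python
import ipaddress


def build_grafana_url(host: str, port: int = 3000, path: str = "/") -> str:
    """Return an http URL for `host`, bracketing IPv6 literals.

    Hypothetical helper for illustration only; not part of SCT.
    """
    try:
        is_ipv6 = ipaddress.ip_address(host).version == 6
    except ValueError:
        # Not an IP literal (for example a DNS name): use it unchanged.
        is_ipv6 = False
    netloc = f"[{host}]" if is_ipv6 else host
    return f"http://{netloc}:{port}{path}"


# IPv6 monitor address taken from the log above.
print(build_grafana_url("2a05:d018:12e3:f002:0418:17b1:04dc:ef71",
                        path="/d/overview-master/overview"))
# IPv4 addresses are left without brackets.
print(build_grafana_url("10.4.10.159"))
```

A URL built this way is syntactically valid, so an `ERR_ADDRESS_INVALID` for it more likely points at the environment the browser runs in than at the URL.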

Packages

Scylla version: 6.1.0~dev-20240525.de798775fd30 with build-id dfcf4bd5b30e981b5da67aa0e57b9821f52d5744

Kernel Version: 5.15.0-1062-aws

Installation details

Cluster size: 12 nodes (i3en.2xlarge)

Scylla Nodes used in this run:

OS / Image: ami-01692a5408b2601d0 ami-00de0108f84a7121d (aws: undefined_region)

Test: longevity-multidc-schema-topology-changes-ipv6-12h
Test id: 54e412f0-b21c-4bc1-945f-9ee4ed0af7cd
Test name: scylla-staging/yulia/Network/longevity-multidc-schema-topology-changes-ipv6-12h
Test config file(s):

Logs and commands

- Restore Monitor Stack command: `$ hydra investigate show-monitor 54e412f0-b21c-4bc1-945f-9ee4ed0af7cd`
- Restore monitor on AWS instance using [Jenkins job](https://jenkins.scylladb.com/view/QA/job/QA-tools/job/hydra-show-monitor/parambuild/?test_id=54e412f0-b21c-4bc1-945f-9ee4ed0af7cd)
- Show all stored logs command: `$ hydra investigate show-logs 54e412f0-b21c-4bc1-945f-9ee4ed0af7cd`

## Logs:

- **COREDUMP-Node parallel-topology-schema-changes-mu-db-node-54e412f0-11 [16.16.149.113 | 10.0.10.184 | 2a05:d016:038c:4a02:2d41:9fa2:a7f3:de12] (dc name: eu-northscylla_node_north)** - [https://storage.cloud.google.com/upload.scylladb.com/core.scylla.112.b2fbbcc2ccdb401094763061ab77745e.7272.1716747624000000/core.scylla.112.b2fbbcc2ccdb401094763061ab77745e.7272.1716747624000000.gz](https://storage.cloud.google.com/upload.scylladb.com/core.scylla.112.b2fbbcc2ccdb401094763061ab77745e.7272.1716747624000000/core.scylla.112.b2fbbcc2ccdb401094763061ab77745e.7272.1716747624000000.gz)
- **COREDUMP-Node parallel-topology-schema-changes-mu-db-node-54e412f0-5 [34.240.184.68 | 10.4.11.236 | 2a05:d018:12e3:f002:28da:c72c:a25a:fb59] (dc name: eu-westscylla_node_west)** - [https://storage.cloud.google.com/upload.scylladb.com/core.scylla.112.22630a9febd44a61a0bf5572542fc51c.10024.1716755710000000/core.scylla.112.22630a9febd44a61a0bf5572542fc51c.10024.1716755710000000.gz](https://storage.cloud.google.com/upload.scylladb.com/core.scylla.112.22630a9febd44a61a0bf5572542fc51c.10024.1716755710000000/core.scylla.112.22630a9febd44a61a0bf5572542fc51c.10024.1716755710000000.gz)
- **db-cluster-54e412f0.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/54e412f0-b21c-4bc1-945f-9ee4ed0af7cd/20240526_214512/db-cluster-54e412f0.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/54e412f0-b21c-4bc1-945f-9ee4ed0af7cd/20240526_214512/db-cluster-54e412f0.tar.gz)
- **sct-runner-events-54e412f0.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/54e412f0-b21c-4bc1-945f-9ee4ed0af7cd/20240526_214512/sct-runner-events-54e412f0.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/54e412f0-b21c-4bc1-945f-9ee4ed0af7cd/20240526_214512/sct-runner-events-54e412f0.tar.gz)
- **sct-54e412f0.log.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/54e412f0-b21c-4bc1-945f-9ee4ed0af7cd/20240526_214512/sct-54e412f0.log.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/54e412f0-b21c-4bc1-945f-9ee4ed0af7cd/20240526_214512/sct-54e412f0.log.tar.gz)
- **loader-set-54e412f0.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/54e412f0-b21c-4bc1-945f-9ee4ed0af7cd/20240526_214512/loader-set-54e412f0.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/54e412f0-b21c-4bc1-945f-9ee4ed0af7cd/20240526_214512/loader-set-54e412f0.tar.gz)
- **monitor-set-54e412f0.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/54e412f0-b21c-4bc1-945f-9ee4ed0af7cd/20240526_214512/monitor-set-54e412f0.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/54e412f0-b21c-4bc1-945f-9ee4ed0af7cd/20240526_214512/monitor-set-54e412f0.tar.gz)

[Jenkins job URL](https://jenkins.scylladb.com/job/scylla-staging/job/yulia/job/Network/job/longevity-multidc-schema-topology-changes-ipv6-12h/8/)
[Argus](https://argus.scylladb.com/test/1c6a8408-05db-4dcb-8960-baee6b6eef82/runs?additionalRuns[]=54e412f0-b21c-4bc1-945f-9ee4ed0af7cd)

fruch commented 5 months ago

This should be fixed once we stop using webdriver for screenshots. @soyacz, here's one more reason for this replacement.

fruch commented 5 months ago

see https://github.com/scylladb/scylla-cluster-tests/issues/7135

fruch commented 5 months ago

We might need to enable IPv6 in Docker for this to work: https://docker-docs.uclv.cu/config/daemon/ipv6/
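
As a rough illustration of the Docker suggestion above: IPv6 on the default bridge is typically enabled through the `ipv6` and `fixed-cidr-v6` keys in the daemon configuration file (commonly `/etc/docker/daemon.json`). The sketch below only generates that JSON. The function name is hypothetical, the subnet is a placeholder from the documentation prefix `2001:db8::/32`, and whether this is enough for the SCT webdriver container is an assumption that was not verified in this thread.

```python
import json


def docker_ipv6_daemon_config(ipv6_subnet: str = "2001:db8:1::/64") -> str:
    """Render a Docker daemon config that enables IPv6 on the default bridge.

    Hypothetical helper; `ipv6_subnet` is a placeholder and must be replaced
    with a prefix that is valid for the actual network.
    """
    config = {
        "ipv6": True,                  # turn on IPv6 for the default bridge
        "fixed-cidr-v6": ipv6_subnet,  # subnet used to assign container addresses
    }
    return json.dumps(config, indent=2)


# Print the JSON that would go into the daemon configuration file.
print(docker_ipv6_daemon_config())
```

The Docker daemon has to be restarted after the configuration file changes for the setting to take effect.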

juliayakovlev commented 5 months ago

Also, the log collector in the finalize stage fails to collect screenshots because of a timeout.

Using public IP:

19:11:05  Save a screenshot of http://34.252.46.35:3000/d/afb9a7b9-e72a-45cd-81da-fb4e31b5298c/7f8806a5-f243-5ead-9616-3dfabd5a21d6?from=1716804552000&to=now&refresh=1d to /home/ubuntu/sct-results/20240527-153948-994528/collected_logs/20240527_160912/monitor-set-e09305d4/longevity-10gb-3h-ipv6-url-monitor-node-e09305d4-1/grafana-screenshot-longevity-10gb-3h-ipv6-test-scylla-per-server-metrics-nemesis-20240527_161050-longevity-10gb-3h-ipv6-url-monitor-node-e09305d4-1.png
19:11:09  Grafana - browser quit
19:11:40  open url: http://34.252.46.35:3000/login
19:11:42  Login to grafana with default credentials
19:11:44  Logged in succesful
19:11:44  Get snapshot link for url http://34.252.46.35:3000/d/overview-master/overview?from=1716804552000&to=now&refresh=1d
19:11:44  Set resolution 1920px*4000px
19:11:44  Get url http://34.252.46.35:3000/d/overview-master/overview?from=1716804552000&to=now&refresh=1d
19:14:59  Error get snapshot overview: Message: 
19:14:59  , traceback: Traceback (most recent call last):
19:14:59    File "/home/ubuntu/scylla-cluster-tests/sdcm/logcollector.py", line 577, in get_grafana_snapshot
19:14:59      dashboard.scroll_to_bottom(self.remote_browser.browser)
19:14:59    File "/home/ubuntu/scylla-cluster-tests/sdcm/monitorstack/ui.py", line 129, in scroll_to_bottom
19:14:59      WebDriverWait(remote_browser, UI_ELEMENT_LOAD_TIMEOUT).until(
19:14:59    File "/usr/local/lib/python3.10/site-packages/selenium/webdriver/support/wait.py", line 80, in until
19:14:59      raise TimeoutException(message, screen, stacktrace)
19:14:59  selenium.common.exceptions.TimeoutException: Message: 

https://argus.scylladb.com/test/f4a292e9-e796-4467-91c2-2c2f9a77c96b/runs?additionalRuns[]=e09305d4-c346-46dd-97bd-566753151add

Using private IP:

16:36:23  Save a screenshot of http://10.4.3.85:3000/d/f3aaccc1-922a-4008-b175-719eab35f7ee/7f8806a5-f243-5ead-9616-3dfabd5a21d6?from=1716795244000&to=now&refresh=1d to /home/ubuntu/sct-results/20240527-115856-750141/collected_logs/20240527_133404/monitor-set-7e13a342/longevity-10gb-3h-ipv6-url-monitor-node-7e13a342-1/grafana-screenshot-longevity-10gb-3h-ipv6-test-scylla-per-server-metrics-nemesis-20240527_133606-longevity-10gb-3h-ipv6-url-monitor-node-7e13a342-1.png
16:36:27  Grafana - browser quit
16:37:03  open url: http://10.4.3.85:3000/login
16:37:03  Login to grafana with default credentials
16:37:03  Logged in succesful
16:37:03  Get snapshot link for url http://10.4.3.85:3000/d/overview-master/overview?from=1716795244000&to=now&refresh=1d
16:37:03  Set resolution 1920px*4000px
16:37:03  Get url http://10.4.3.85:3000/d/overview-master/overview?from=1716795244000&to=now&refresh=1d
16:40:16  Error get snapshot overview: Message: 
16:40:16  , traceback: Traceback (most recent call last):
16:40:16    File "/home/ubuntu/scylla-cluster-tests/sdcm/logcollector.py", line 577, in get_grafana_snapshot
16:40:16      dashboard.scroll_to_bottom(self.remote_browser.browser)
16:40:16    File "/home/ubuntu/scylla-cluster-tests/sdcm/monitorstack/ui.py", line 129, in scroll_to_bottom
16:40:16      WebDriverWait(remote_browser, UI_ELEMENT_LOAD_TIMEOUT).until(
16:40:16    File "/usr/local/lib/python3.10/site-packages/selenium/webdriver/support/wait.py", line 80, in until
16:40:16      raise TimeoutException(message, screen, stacktrace)
16:40:16  selenium.common.exceptions.TimeoutException: Message: 

https://argus.scylladb.com/test/f4a292e9-e796-4467-91c2-2c2f9a77c96b/runs?additionalRuns[]=7e13a342-f65a-4e25-9b56-11c24123ad87
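
Both tracebacks above end in a `TimeoutException` with an empty `Message:`, which hides what the wait was for. Selenium's `WebDriverWait.until` accepts an optional `message` argument for this purpose. The stdlib-only sketch below shows the same idea without depending on Selenium; `wait_until` and its parameters are hypothetical names used for illustration, not SCT code.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout: float,
               message: str, poll: float = 0.5) -> None:
    """Poll `condition` until it returns True or `timeout` seconds pass.

    Raises TimeoutError carrying `message`, so the log states what was
    being waited for instead of showing an empty message.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{message} (waited {timeout}s)")
        time.sleep(poll)


# Example: a condition that never becomes true produces a descriptive error.
try:
    wait_until(lambda: False, timeout=0.1,
               message="dashboard panels did not finish loading", poll=0.02)
except TimeoutError as exc:
    print(f"Error get snapshot overview: {exc}")
```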

fruch commented 4 months ago

#7503 fixes the issue, since this webdriver isn't being used anymore.