Logs:
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Log links for testrun with test id 274b80e1-21b2-456d-8602-4f3aac4f5df7 |
+-----------------+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Date | Log type | Link |
+-----------------+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 20211231_162004 | grafana | https://cloudius-jenkins-test.s3.amazonaws.com/274b80e1-21b2-456d-8602-4f3aac4f5df7/20211231_162004/grafana-screenshot-longevity-twcs-48h-test-scylla-per-server-metrics-nemesis-20211231_162231-longevity-twcs-48h-master-monitor-node-274b80e1-1.png |
| 20211231_162004 | grafana | https://cloudius-jenkins-test.s3.amazonaws.com/274b80e1-21b2-456d-8602-4f3aac4f5df7/20211231_162004/grafana-screenshot-overview-20211231_162004-longevity-twcs-48h-master-monitor-node-274b80e1-1.png |
| 20211231_163324 | db-cluster | https://cloudius-jenkins-test.s3.amazonaws.com/274b80e1-21b2-456d-8602-4f3aac4f5df7/20211231_163324/db-cluster-274b80e1.tar.gz |
| 20211231_163324 | loader-set | https://cloudius-jenkins-test.s3.amazonaws.com/274b80e1-21b2-456d-8602-4f3aac4f5df7/20211231_163324/loader-set-274b80e1.tar.gz |
| 20211231_163324 | monitor-set | https://cloudius-jenkins-test.s3.amazonaws.com/274b80e1-21b2-456d-8602-4f3aac4f5df7/20211231_163324/monitor-set-274b80e1.tar.gz |
| 20211231_163324 | sct | https://cloudius-jenkins-test.s3.amazonaws.com/274b80e1-21b2-456d-8602-4f3aac4f5df7/20211231_163324/sct-runner-274b80e1.tar.gz |
+-----------------+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Installation details
Scylla version (or git commit hash): 4.7.dev-0.20211230.12fa68fe6 with build-id 938d162cf1be1483853bf5b5417f5daf11e5ac24
Cluster size: 4
OS (RHEL/CentOS/Ubuntu/AWS AMI): ami-0668e251eb4f0f394 (us-east-1)
Scenario: https://github.com/scylladb/scylla-cluster-tests/blob/master/test-cases/longevity/longevity-twcs-48h.yaml
The first 4 nodes in the cluster each started within 2 minutes at most:
node 1 (3.250.78.230 | 10.0.3.58):
node 2 (54.246.205.54 | 10.0.1.13):
node 3 (52.214.78.106 | 10.0.3.183):
node 4 (63.35.248.245 | 10.0.2.74):
However, starting with the fifth node, the init time of each subsequent node grew longer:
node 5 (3.249.249.101 | 10.0.2.141):
node 6 (54.154.183.144 | 10.0.1.207):
node 7 (18.203.127.246 | 10.0.2.142):
It seems that the part of the bootstrap process that caused the increase in initialization time is the range_streamer:
node 2 (54.246.205.54 | 10.0.1.13):
node 3 (52.214.78.106 | 10.0.3.183):
node 4 (63.35.248.245 | 10.0.2.74):
node 5 (3.249.249.101 | 10.0.2.141):
node 6 (54.154.183.144 | 10.0.1.207):
node 7 (18.203.127.246 | 10.0.2.142):
While it makes sense that nodes 5, 6 and 7 would take longer, since the cluster holds more data once the first 4 nodes have started, the increase should not be this large.
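To put a number on the slowdown, a small sketch like the following could turn the start and end timestamps of each node's init (or range_streamer) phase into durations and express them relative to the first four nodes. The timestamp layout, the function names, and any values passed in are assumptions for illustration only; they are not taken from the logs of this run.

```python
from datetime import datetime

# Assumed timestamp layout; adjust it to match the actual log lines.
TS_FORMAT = "%Y-%m-%dT%H:%M:%S"


def elapsed_seconds(start, end):
    """Return the seconds elapsed between two timestamp strings."""
    started = datetime.strptime(start, TS_FORMAT)
    finished = datetime.strptime(end, TS_FORMAT)
    return (finished - started).total_seconds()


def slowdown_factors(init_seconds, baseline_nodes):
    """Express each node's init time as a multiple of the mean baseline init time.

    init_seconds: mapping of node name -> init duration in seconds.
    baseline_nodes: node names used as the reference (e.g. the first 4 nodes).
    """
    baseline = sum(init_seconds[n] for n in baseline_nodes) / len(baseline_nodes)
    return {node: secs / baseline for node, secs in init_seconds.items()}
```

Feeding it the init start and completion timestamps of all 7 nodes, with nodes 1 to 4 as the baseline, would show how many times longer nodes 5, 6 and 7 took.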