Closed fruch closed 1 year ago
It looks like we call `disable_daily_triggered_services` only for DB nodes, so on other node types the distro can still decide to run upgrades while the test is trying to install packages.
I think we should consider calling `disable_daily_triggered_services` in every node setup we do.
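A minimal sketch of what "call it for every node type" could look like. This is not the actual `disable_daily_triggered_services` implementation: the unit names are the ones stock Ubuntu images normally ship, and `build_disable_commands`, `disable_on_all_nodes`, and the `run` callback are hypothetical names used only for illustration.

```python
from typing import Callable, Iterable, List

# Assumption: systemd units that trigger background apt activity on stock
# Ubuntu images. Other distros would need a different list.
DAILY_UNITS = (
    "apt-daily.timer",
    "apt-daily-upgrade.timer",
    "apt-daily.service",
    "apt-daily-upgrade.service",
    "unattended-upgrades.service",
)


def build_disable_commands(units: Iterable[str] = DAILY_UNITS) -> List[str]:
    """Return the shell commands that stop and then disable each unit."""
    commands = []
    for unit in units:
        commands.append(f"sudo systemctl stop {unit}")
        commands.append(f"sudo systemctl disable {unit}")
    return commands


def disable_on_all_nodes(nodes: Iterable[str],
                         run: Callable[[str, str], None]) -> None:
    """Run the disable commands on every node, regardless of its role.

    `run(node, command)` is a hypothetical callback standing in for
    whatever remote-execution helper the framework provides.
    """
    for node in nodes:
        for command in build_disable_commands():
            run(node, command)
```

The point of the sketch is only that the same routine runs for DB, loader, and monitor nodes before any package installation starts.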
Received a similar failure in https://jenkins.scylladb.com/job/enterprise-2023.1/job/artifacts/job/artifacts-ubuntu2004-fips-test/
2023-10-01 18:13:15.735: (TestFrameworkEvent Severity.ERROR) period_type=one-time event_id=4c82d8a6-21f5-4c3a-8b57-04488277f413, source=ArtifactsTest.SetUp()
exception=[Node artifacts-ubuntu2004-fips-jenkins-db-node-d289941e-1 [44.199.236.142 | 10.12.0.192] (seed: True)] NodeSetupFailed: Encountered a bad command exit code!
Command: 'sudo bash -cxe "\nexport DEBIAN_FRONTEND=noninteractive\napt-get install software-properties-common -y\n"'
Exit code: 100
Stdout:
Stderr:
+ export DEBIAN_FRONTEND=noninteractive
+ DEBIAN_FRONTEND=noninteractive
+ apt-get install software-properties-common -y
E: Could not get lock /var/lib/dpkg/lock-frontend. It is held by process 3901 (apt-get)
E: Unable to acquire the dpkg frontend lock (/var/lib/dpkg/lock-frontend), is another process using it?
Traceback (most recent call last):
File "/tmp/jenkins/workspace/enterprise-2023.1/artifacts/artifacts-ubuntu2004-fips-test/scylla-cluster-tests/sdcm/cluster.py", line 3711, in node_setup
cl_inst.node_setup(_node, **setup_kwargs)
File "/tmp/jenkins/workspace/enterprise-2023.1/artifacts/artifacts-ubuntu2004-fips-test/scylla-cluster-tests/sdcm/cluster.py", line 4395, in node_setup
self._scylla_install(node)
File "/tmp/jenkins/workspace/enterprise-2023.1/artifacts/artifacts-ubuntu2004-fips-test/scylla-cluster-tests/sdcm/cluster.py", line 4477, in _scylla_install
node.install_scylla(scylla_repo=self.params.get('scylla_repo'))
File "/tmp/jenkins/workspace/enterprise-2023.1/artifacts/artifacts-ubuntu2004-fips-test/scylla-cluster-tests/sdcm/cluster.py", line 1945, in install_scylla
self.remoter.run('sudo bash -cxe "%s"' % install_prereqs)
File "/tmp/jenkins/workspace/enterprise-2023.1/artifacts/artifacts-ubuntu2004-fips-test/scylla-cluster-tests/sdcm/remote/remote_base.py", line 613, in run
result = _run()
File "/tmp/jenkins/workspace/enterprise-2023.1/artifacts/artifacts-ubuntu2004-fips-test/scylla-cluster-tests/sdcm/utils/decorators.py", line 67, in inner
return func(*args, **kwargs)
File "/tmp/jenkins/workspace/enterprise-2023.1/artifacts/artifacts-ubuntu2004-fips-test/scylla-cluster-tests/sdcm/remote/remote_base.py", line 604, in _run
return self._run_execute(cmd, timeout, ignore_status, verbose, new_session, watchers)
File "/tmp/jenkins/workspace/enterprise-2023.1/artifacts/artifacts-ubuntu2004-fips-test/scylla-cluster-tests/sdcm/remote/remote_base.py", line 537, in _run_execute
result = connection.run(**command_kwargs)
File "/tmp/jenkins/workspace/enterprise-2023.1/artifacts/artifacts-ubuntu2004-fips-test/scylla-cluster-tests/sdcm/remote/libssh2_client/__init__.py", line 620, in run
return self._complete_run(channel, exception, timeout_reached, timeout, result, warn, stdout, stderr)
File "/tmp/jenkins/workspace/enterprise-2023.1/artifacts/artifacts-ubuntu2004-fips-test/scylla-cluster-tests/sdcm/remote/libssh2_client/__init__.py", line 654, in _complete_run
raise UnexpectedExit(result)
sdcm.remote.libssh2_client.exceptions.UnexpectedExit: Encountered a bad command exit code!
Command: 'sudo bash -cxe "\nexport DEBIAN_FRONTEND=noninteractive\napt-get install software-properties-common -y\n"'
Exit code: 100
Stdout:
Stderr:
+ export DEBIAN_FRONTEND=noninteractive
+ DEBIAN_FRONTEND=noninteractive
+ apt-get install software-properties-common -y
E: Could not get lock /var/lib/dpkg/lock-frontend. It is held by process 3901 (apt-get)
E: Unable to acquire the dpkg frontend lock (/var/lib/dpkg/lock-frontend), is another process using it?
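The stderr above names the process that holds the dpkg lock. When triaging similar failures, a small helper can pull that information out of the captured output. This is a rough sketch, not part of the test framework: `LockHolder` and `find_lock_holder` are hypothetical names, and the pattern only covers the message format shown in this log.

```python
import re
from typing import NamedTuple, Optional


class LockHolder(NamedTuple):
    lock_path: str
    pid: int
    process: str


# Matches the apt message format seen in the log above, e.g.
# "E: Could not get lock /var/lib/dpkg/lock-frontend. It is held by process 3901 (apt-get)"
_LOCK_RE = re.compile(
    r"Could not get lock (\S+)\. It is held by process (\d+) \(([^)]+)\)"
)


def find_lock_holder(stderr: str) -> Optional[LockHolder]:
    """Return the lock path, PID, and process name, or None if no match."""
    match = _LOCK_RE.search(stderr)
    if match is None:
        return None
    return LockHolder(match.group(1), int(match.group(2)), match.group(3))
```

A holder named `apt-get` or `unattended-upgr` right after boot points to the distro's own scheduled upgrade rather than to anything the test started.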
Cluster size: 1 nodes (i3.large)
Scylla Nodes used in this run:
OS / Image: ami-03cf7ddd346310b5f
(aws: us-east-1)
Test: artifacts-ubuntu2004-fips-test
Test id: d289941e-9de9-4693-9f01-3b836a0dc602
Test name: enterprise-2023.1/artifacts/artifacts-ubuntu2004-fips-test
Test config file(s):
@fruch why is it in "Waiting for Review"? What review? :)
It's a mix
This is solved in: https://github.com/scylladb/scylla-cluster-tests/pull/6759
Issue description
The monitor node seems to fail its setup/installation because the dpkg lock is held by process 3930 (unattended-upgr).
How frequently does it reproduce?
Seen only once, so far
Installation details
Cluster size: 6 nodes (i4i.4xlarge)
Scylla Nodes used in this run:
OS / Image: ami-006f6d3d045731dd6
(aws: undefined_region)
Test: longevity-100gb-4h-test
Test id: 62931180-dfdd-4aa9-b1cd-476dcd8d3600
Test name: scylla-master/longevity/longevity-100gb-4h-test
Test config file(s):
Logs and commands
- Restore Monitor Stack command: `$ hydra investigate show-monitor 62931180-dfdd-4aa9-b1cd-476dcd8d3600`
- Restore monitor on AWS instance using [Jenkins job](https://jenkins.scylladb.com/view/QA/job/QA-tools/job/hydra-show-monitor/parambuild/?test_id=62931180-dfdd-4aa9-b1cd-476dcd8d3600)
- Show all stored logs command: `$ hydra investigate show-logs 62931180-dfdd-4aa9-b1cd-476dcd8d3600`

## Logs:

- **db-cluster-62931180.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/62931180-dfdd-4aa9-b1cd-476dcd8d3600/20230903_061734/db-cluster-62931180.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/62931180-dfdd-4aa9-b1cd-476dcd8d3600/20230903_061734/db-cluster-62931180.tar.gz)
- **sct-runner-events-62931180.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/62931180-dfdd-4aa9-b1cd-476dcd8d3600/20230903_061734/sct-runner-events-62931180.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/62931180-dfdd-4aa9-b1cd-476dcd8d3600/20230903_061734/sct-runner-events-62931180.tar.gz)
- **sct-62931180.log.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/62931180-dfdd-4aa9-b1cd-476dcd8d3600/20230903_061734/sct-62931180.log.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/62931180-dfdd-4aa9-b1cd-476dcd8d3600/20230903_061734/sct-62931180.log.tar.gz)
- **loader-set-62931180.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/62931180-dfdd-4aa9-b1cd-476dcd8d3600/20230903_061734/loader-set-62931180.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/62931180-dfdd-4aa9-b1cd-476dcd8d3600/20230903_061734/loader-set-62931180.tar.gz)
- **monitor-set-62931180.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/62931180-dfdd-4aa9-b1cd-476dcd8d3600/20230903_061734/monitor-set-62931180.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/62931180-dfdd-4aa9-b1cd-476dcd8d3600/20230903_061734/monitor-set-62931180.tar.gz)
- **parallel-timelines-report-62931180.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/62931180-dfdd-4aa9-b1cd-476dcd8d3600/20230903_061734/parallel-timelines-report-62931180.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/62931180-dfdd-4aa9-b1cd-476dcd8d3600/20230903_061734/parallel-timelines-report-62931180.tar.gz)

[Jenkins job URL](https://jenkins.scylladb.com/job/scylla-master/job/longevity/job/longevity-100gb-4h-test/676/)
[Argus](https://argus.scylladb.com/test/f05fea04-eb74-4961-94fb-f71c67df52cb/runs?additionalRuns[]=62931180-dfdd-4aa9-b1cd-476dcd8d3600)