scylladb / scylladb

NoSQL data store using the seastar framework, compatible with Apache Cassandra
http://scylladb.com
GNU Affero General Public License v3.0

`perftune.yaml` is not created on AL2, causing Scylla Doctor collector failure #18631

Open juliayakovlev opened 2 months ago

juliayakovlev commented 2 months ago

Packages

Scylla version: 5.5.0~dev-20240509.28791aa2c1d3 with build-id 9e34b204407d7acc91d3980f060cbdfb6a927bdf
Kernel Version: 4.14.165-131.185.amzn2.x86_64

Issue description

Scylla Doctor was recently added to all artifact tests.

It failed in all amazon2 tests with the error:

perftune.yaml is not found

perftune.yaml is indeed not created.

I tried to run the test with the latest amazon2 image (ami-0c9e7cbc8f9f6446b, eu-west-1)

,     #_
   ~\_  ####_        Amazon Linux 2
  ~~  \_#####\
  ~~     \###|       AL2 End of Life is 2025-06-30.
  ~~       \#/ ___
   ~~       V~' '->
    ~~~         /    A newer version of Amazon Linux is available!
      ~~._.   _/
         _/ _/       Amazon Linux 2023, GA and supported until 2028-03-15.
       _/m/'           https://aws.amazon.com/linux/amazon-linux-2023/
[ec2-user@artifacts-amazon2-jenkins-db-node-127dc754-1 ~]$ systemctl --version
systemd 219

The problem persists: perftune.yaml does not exist.

Looking into journalctl, I found:

May 12 13:54:16 artifacts-amazon2-jenkins-db-node-127dc754-1 systemd[1]: [/usr/lib/systemd/system/scylla-server.service:18] Executable path is not absolute, ignoring: +/opt/scylladb/scripts/scylla_prepare
May 12 13:54:16 artifacts-amazon2-jenkins-db-node-127dc754-1 systemd[1]: [/usr/lib/systemd/system/scylla-server.service:20] Executable path is not absolute, ignoring: +/opt/scylladb/scripts/scylla_stop
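
To spot this class of warning across many nodes, the two journal lines above can be parsed mechanically. The following is a minimal sketch, not part of Scylla or Scylla Doctor: the names `IgnoredExec` and `parse_ignored_exec` are hypothetical, and the regular expression assumes the message format is exactly as shown in the log above.

```python
import re
from typing import NamedTuple, Optional


class IgnoredExec(NamedTuple):
    """One 'Executable path is not absolute' warning from systemd (hypothetical helper type)."""
    unit_file: str   # unit file that contained the rejected directive
    line_no: int     # line number inside that unit file
    path: str        # the executable path exactly as systemd saw it


# Assumes the message layout seen in the journal excerpt:
#   [<unit file>:<line>] Executable path is not absolute, ignoring: <path>
_PATTERN = re.compile(
    r"\[(?P<unit>[^\]:]+):(?P<line>\d+)\] "
    r"Executable path is not absolute, ignoring: (?P<path>\S+)"
)


def parse_ignored_exec(journal_line: str) -> Optional[IgnoredExec]:
    """Return the parsed warning, or None if the line is not such a warning."""
    match = _PATTERN.search(journal_line)
    if match is None:
        return None
    return IgnoredExec(match.group("unit"), int(match.group("line")), match.group("path"))


if __name__ == "__main__":
    line = (
        "May 12 13:54:16 artifacts-amazon2-jenkins-db-node-127dc754-1 systemd[1]: "
        "[/usr/lib/systemd/system/scylla-server.service:18] "
        "Executable path is not absolute, ignoring: +/opt/scylladb/scripts/scylla_prepare"
    )
    print(parse_ignored_exec(line))
```

A parsed `path` that starts with `+` indicates that systemd treated the prefix as part of the path rather than as a modifier, which matches the two warnings above.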

There is a commit that changed this part of the service file. I do not know whether it has an impact here.

@roydahan @tzach Should we keep AL2 support, switch to AL3, keep both, or drop Amazon Linux support completely?

Installation details

Cluster size: 1 node (i4i.large)

Scylla Nodes used in this run:

OS / Image: ami-099a8245f5daa82bf (aws: undefined_region)

Test: artifacts-amazon2-test
Test id: dc92ef21-b7b9-4433-afae-130a9800a65d
Test name: scylla-master/artifacts/artifacts-amazon2-test
Test config file(s):

Logs and commands

- Restore Monitor Stack command: `$ hydra investigate show-monitor dc92ef21-b7b9-4433-afae-130a9800a65d`
- Restore monitor on AWS instance using [Jenkins job](https://jenkins.scylladb.com/view/QA/job/QA-tools/job/hydra-show-monitor/parambuild/?test_id=dc92ef21-b7b9-4433-afae-130a9800a65d)
- Show all stored logs command: `$ hydra investigate show-logs dc92ef21-b7b9-4433-afae-130a9800a65d`

## Logs:

- **db-cluster-dc92ef21.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/dc92ef21-b7b9-4433-afae-130a9800a65d/20240509_232345/db-cluster-dc92ef21.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/dc92ef21-b7b9-4433-afae-130a9800a65d/20240509_232345/db-cluster-dc92ef21.tar.gz)
- **sct-runner-events-dc92ef21.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/dc92ef21-b7b9-4433-afae-130a9800a65d/20240509_232345/sct-runner-events-dc92ef21.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/dc92ef21-b7b9-4433-afae-130a9800a65d/20240509_232345/sct-runner-events-dc92ef21.tar.gz)
- **sct-dc92ef21.log.tar.gz** - [https://cloudius-jenkins-test.s3.amazonaws.com/dc92ef21-b7b9-4433-afae-130a9800a65d/20240509_232345/sct-dc92ef21.log.tar.gz](https://cloudius-jenkins-test.s3.amazonaws.com/dc92ef21-b7b9-4433-afae-130a9800a65d/20240509_232345/sct-dc92ef21.log.tar.gz)

[Jenkins job URL](https://jenkins.scylladb.com/job/scylla-master/job/artifacts/job/artifacts-amazon2-test/720/)

[Argus](https://argus.scylladb.com/test/2016dc82-62a0-4ce2-9ff0-ef655088ee7a/runs?additionalRuns[]=dc92ef21-b7b9-4433-afae-130a9800a65d)

juliayakovlev commented 2 months ago

@fruch @vladzcloudius FYI

fruch commented 2 months ago

@roydahan cc90ff1646b196ca04ad7d889d83dc5b6a1451af is incompatible with AL2, which has systemd 219. And to your question: there is an Amazon Linux 3, which they call Amazon Linux 2023 - https://aws.amazon.com/linux/amazon-linux-2023/

So @tzach, we need a call here: do we keep AL2 support (it's EOL in 14 months)? If yes, cc90ff1646b196ca04ad7d889d83dc5b6a1451af needs to be reverted.
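
As a rough illustration of the incompatibility, a pre-flight check could read the `systemctl --version` output shown earlier in this thread and flag hosts whose systemd is too old for the `+` prefix on `Exec*=` lines. This is only a sketch: the function names are hypothetical, and the minimum version of 231 is an assumption from memory of the systemd changelog that should be verified before anyone relies on it. The thread itself confirms only that systemd 219 rejects the prefix.

```python
import re
import subprocess
from typing import Optional

# ASSUMPTION: the "+" prefix for Exec*= lines is believed to have been added in
# systemd 231. Verify against the systemd NEWS file before relying on this value.
ASSUMED_MIN_VERSION_FOR_PLUS_PREFIX = 231


def parse_systemd_version(output: str) -> Optional[int]:
    """Extract the major version from `systemctl --version` output, e.g. 'systemd 219'."""
    match = re.match(r"\s*systemd\s+(\d+)", output)
    return int(match.group(1)) if match else None


def supports_plus_prefix(version: int,
                         minimum: int = ASSUMED_MIN_VERSION_FOR_PLUS_PREFIX) -> bool:
    """True if the given systemd version is at least the assumed minimum."""
    return version >= minimum


def installed_systemd_version() -> Optional[int]:
    """Query the local systemctl; returns None if it is missing or unparseable."""
    try:
        result = subprocess.run(
            ["systemctl", "--version"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        return None
    return parse_systemd_version(result.stdout)


if __name__ == "__main__":
    version = installed_systemd_version()
    if version is None:
        print("systemd version could not be determined")
    elif supports_plus_prefix(version):
        print(f"systemd {version}: '+' prefix expected to work (under the assumed threshold)")
    else:
        print(f"systemd {version}: '+' prefix likely unsupported, as seen on AL2")
```

On the AL2 node from this report, `parse_systemd_version("systemd 219")` yields 219, which falls below the assumed threshold.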

mykaul commented 2 months ago

Not sure there's a lot to triage here. I'd drop support for it.