scylladb / scylla-machine-image

Disable EOL scylla repositories in k8s node setup image #442

Closed zimnx closed 1 year ago

zimnx commented 1 year ago

The k8s node setup image, created two years ago, is used in the scylla-operator EKS example. The image is supposed to set up a RAID array, create an XFS filesystem, and mount it in the desired location using helper scripts taken from this repository and from the Scylla image. It is based on Scylla 4.1, which reached end of life some time ago, and its repository was removed. This causes failures for k8s image users because the repository added in the base image is no longer there:

Loaded plugins: fastestmirror, ovl
Determining fastest mirrors
 * base: download.cf.centos.org
 * epel: d2lzkl7pfhq30w.cloudfront.net
 * extras: download.cf.centos.org
 * updates: download.cf.centos.org
http://downloads.scylladb.com/rpm/unstable/centos/branch-4.1/2020-08-31T00%3A50%3A11Z/scylla/x86_64/repodata/repomd.xml: [Errno 14] HTTP Error 404 - Not Found
Trying other mirror.
To address this issue please refer to the below wiki article 

https://wiki.centos.org/yum-errors

If above article doesn't help to resolve this issue please use https://bugs.centos.org/.

 One of the configured repositories failed (Scylla for Centos 7 - x86_64),
 and yum doesn't have enough cached data to continue. At this point the only
 safe thing yum can do is fail. There are a few ways to work "fix" this:

     1. Contact the upstream for the repository and get them to fix the problem.

     2. Reconfigure the baseurl/etc. for the repository, to point to a working
        upstream. This is most often useful if you are using a newer
        distribution release than is supported by the repository (and the
        packages for the previous distribution release still work).

     3. Run the command with the repository temporarily disabled
            yum --disablerepo=scylla ...

     4. Disable the repository permanently, so yum won't use it by default. Yum
        will then just ignore the repository until you permanently enable it
        again or use --enablerepo for temporary usage:

            yum-config-manager --disable scylla
        or
            subscription-manager repos --disable=scylla

     5. Configure the failing repository to be skipped, if it is unavailable.
        Note that yum will try to contact the repo. when it runs most commands,
        so will have to try and fail each time (and thus. yum will be be much
        slower). If it is a very temporary problem though, this is often a nice
        compromise:

            yum-config-manager --save --setopt=scylla.skip_if_unavailable=true

failure: repodata/repomd.xml from scylla: [Errno 256] No more mirrors to try.
http://downloads.scylladb.com/rpm/unstable/centos/branch-4.1/2020-08-31T00:50:11Z/scylla/x86_64/repodata/repomd.xml: [Errno 14] HTTP Error 404 - Not Found
Traceback (most recent call last):
  File "/opt/scylladb/scripts/libexec/scylla_raid_setup", line 86, in <module>
    run('yum install -y mdadm xfsprogs')
  File "/opt/scylladb/scripts/scylla_util.py", line 340, in run
    return subprocess.check_call(cmd, shell=shell, stdout=stdout, stderr=stderr, env=scylla_env)
  File "/opt/scylladb/python3/lib64/python3.7/subprocess.py", line 363, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['yum', 'install', '-y', 'mdadm', 'xfsprogs']' returned non-zero exit status 1.
RAID Array containing ['nvme0n1', 'nvme1n1'] not found. Creating...
Traceback (most recent call last):
  File "/opt/scylladb/scylla-machine-image/scylla_create_devices", line 194, in <module>
    get_disk_bundles()
  File "/opt/scylladb/scylla-machine-image/scylla_create_devices", line 183, in get_disk_bundles
    config_array(typemap[t], role[t], mdidx)
  File "/opt/scylladb/scylla-machine-image/scylla_create_devices", line 80, in config_array
    "--update-fstab"], check=True)
  File "/opt/scylladb/python3/lib64/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/opt/scylladb/scripts/scylla_raid_setup', '--raiddev', '/dev/md0', '--disks', '/dev/nvme0n1,/dev/nvme1n1', '--root', '/mnt/hostfs/mnt/raid-disks/disk0', '--volume-role', 'all', '--update-fstab']' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/opt/scylladb/scylla-machine-image/scylla_k8s_node_setup", line 65, in <module>
    run('/opt/scylladb/scylla-machine-image/scylla_create_devices --scylla-data-root {}'.format(root_disk))
  File "/opt/scylladb/scripts/scylla_util.py", line 340, in run
    return subprocess.check_call(cmd, shell=shell, stdout=stdout, stderr=stderr, env=scylla_env)
  File "/usr/lib64/python3.6/subprocess.py", line 311, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/opt/scylladb/scylla-machine-image/scylla_create_devices', '--scylla-data-root', '/mnt/hostfs/mnt/raid-disks/disk0']' returned non-zero exit status 1.
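
The yum output above fails on the repository with id `scylla`, and yum itself suggests disabling it. Below is a minimal sketch of how an image build step could do that by setting `enabled=0` in a yum `.repo` file. It is not the change made in this PR: `disable_repos` is a hypothetical helper, and `SAMPLE` is an illustrative repo definition assembled from the repo id, name, and URL visible in the log. The real file in the base image may differ.

```python
import configparser
import io


def disable_repos(repo_text, repo_ids=None):
    """Return repo_text with enabled=0 set for the selected repo sections.

    repo_ids=None disables every section found in the text.
    Note: configparser drops comments when the file is written back.
    """
    # interpolation=None so '%3A' in URLs is not treated as an interpolation marker
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(repo_text)
    for section in parser.sections():
        if repo_ids is None or section in repo_ids:
            parser[section]["enabled"] = "0"
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


# Illustrative input only (assumption): built from values seen in the log above.
SAMPLE = """[scylla]
name=Scylla for Centos 7 - x86_64
baseurl=http://downloads.scylladb.com/rpm/unstable/centos/branch-4.1/2020-08-31T00%3A50%3A11Z/scylla/x86_64/
enabled=1
"""

if __name__ == "__main__":
    print(disable_repos(SAMPLE, {"scylla"}))
```

With the repository disabled, `yum install -y mdadm xfsprogs` would only consult the remaining repositories (base, epel, extras, updates), so the removed Scylla 4.1 repository no longer blocks the install.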

Because the scylla_create_devices script was moved and changed to the point that it no longer works with k8s node setup, an old version was brought back into the k8s directory, just for AWS. The GCE image is not used anywhere, hence I fixed it just for AWS.
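
For reference, the traceback above shows the exact arguments with which the older scylla_create_devices calls scylla_raid_setup. The sketch below rebuilds that argument list. `raid_setup_command` is a hypothetical helper, not a function from this repository, and it uses only the flags and paths that appear in the log.

```python
def raid_setup_command(disks, root, raiddev="/dev/md0", volume_role="all",
                       update_fstab=True,
                       script="/opt/scylladb/scripts/scylla_raid_setup"):
    """Build the scylla_raid_setup argument list as seen in the traceback.

    Hypothetical helper for illustration; flag names are copied from the log.
    """
    cmd = [
        script,
        "--raiddev", raiddev,
        "--disks", ",".join(disks),  # the log passes disks as one comma-separated value
        "--root", root,
        "--volume-role", volume_role,
    ]
    if update_fstab:
        cmd.append("--update-fstab")
    return cmd


if __name__ == "__main__":
    print(raid_setup_command(
        ["/dev/nvme0n1", "/dev/nvme1n1"],
        "/mnt/hostfs/mnt/raid-disks/disk0",
    ))
```

Passing the result to `subprocess.run(cmd, check=True)` would mirror the call that raised `CalledProcessError` in the log.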

This image is going to be removed soon, as a different solution for disk setup is coming to scylla-operator. This PR is just a hotfix for users and QA who used this image in the past and are now stuck.

Xref: https://github.com/scylladb/scylla-operator/issues/1218

zimnx commented 1 year ago

@yaronkaikov PTAL, we need this image published to unblock our users and avoid a workaround in QA code.

zimnx commented 1 year ago

@yaronkaikov ping

yaronkaikov commented 1 year ago

@zimnx Sorry for the delay, missed this notification

zimnx commented 1 year ago

> @zimnx Sorry for the delay, missed this notification

No worries, thanks for merging. Would you mind pushing this image to Docker Hub? I don't have permissions there.