Closed svetsa-rh closed 2 years ago
This is the first of a series of PRs to make the benchmark-operator and e2e-benchmarking repos aarch64-compatible.
Let us know if it is preferable to copy this image to quay.io/cloud-bulldozer to avoid docker.io rate limiting.
Hi @mffiedler, the docker.io rate limiter is always a problem, but IIRC the image bitnami/redis:latest is already on docker.io, right?
FYI: I just copied the redis repository (with all the archs) to the cloud-bulldozer org in quay: https://quay.io/repository/cloud-bulldozer/redis?tab=tags
I think we can switch to it in this PR (quay.io/cloud-bulldozer/redis:latest)
@svetsa-rh Please update this PR to point benchmark-operator to the image uploaded by Raul.
Saw these Redis-related errors in the log:
[pod/kube-burner-ca652ecd-lfffs/backpack] Traceback (most recent call last):
[pod/kube-burner-ca652ecd-lfffs/backpack] File "stockpile-wrapper.py", line 257, in <module>
[pod/kube-burner-ca652ecd-lfffs/backpack] sys.exit(main())
[pod/kube-burner-ca652ecd-lfffs/backpack] File "stockpile-wrapper.py", line 224, in main
[pod/kube-burner-ca652ecd-lfffs/backpack] run = _mark_node(r, my_node, my_uuid, es, check_val)
[pod/kube-burner-ca652ecd-lfffs/backpack] File "stockpile-wrapper.py", line 161, in _mark_node
[pod/kube-burner-ca652ecd-lfffs/backpack] r.set(check_val, "Metadata-Running")
[pod/kube-burner-ca652ecd-lfffs/backpack] File "/usr/local/lib/python3.6/site-packages/redis/commands/core.py", line 2127, in set
[pod/kube-burner-ca652ecd-lfffs/backpack] return self.execute_command("SET", *pieces, **options)
[pod/kube-burner-ca652ecd-lfffs/backpack] File "/usr/local/lib/python3.6/site-packages/redis/client.py", line 1222, in execute_command
[pod/kube-burner-ca652ecd-lfffs/backpack] lambda error: self._disconnect_raise(conn, error),
[pod/kube-burner-ca652ecd-lfffs/backpack] File "/usr/local/lib/python3.6/site-packages/redis/retry.py", line 45, in call_with_retry
[pod/kube-burner-ca652ecd-lfffs/backpack] return do()
[pod/kube-burner-ca652ecd-lfffs/backpack] File "/usr/local/lib/python3.6/site-packages/redis/client.py", line 1220, in <lambda>
[pod/kube-burner-ca652ecd-lfffs/backpack] conn, command_name, *args, **options
[pod/kube-burner-ca652ecd-lfffs/backpack] File "/usr/local/lib/python3.6/site-packages/redis/client.py", line 1195, in _send_command_parse_response
[pod/kube-burner-ca652ecd-lfffs/backpack] return self.parse_response(conn, command_name, **options)
[pod/kube-burner-ca652ecd-lfffs/backpack] File "/usr/local/lib/python3.6/site-packages/redis/client.py", line 1234, in parse_response
[pod/kube-burner-ca652ecd-lfffs/backpack] response = connection.read_response()
[pod/kube-burner-ca652ecd-lfffs/backpack] File "/usr/local/lib/python3.6/site-packages/redis/connection.py", line 836, in read_response
[pod/kube-burner-ca652ecd-lfffs/backpack] raise response
[pod/kube-burner-ca652ecd-lfffs/backpack] redis.exceptions.ResponseError: MISCONF Redis is configured to save RDB snapshots, but it is currently not able to persist on disk. Commands that may modify the data set are disabled, because this instance is configured to report errors during writes if RDB snapshotting fails (stop-writes-on-bgsave-error option). Please check the Redis logs for details about the RDB error.
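For anyone hitting the same traceback: the MISCONF reply means Redis is refusing writes because its last background RDB save failed. Below is a minimal sketch of how a caller could surface that more clearly. It assumes only that the client object has a `set` method (as redis-py clients do) and that the error text starts with `MISCONF`, as in the log above. `is_misconf` and `set_with_hint` are hypothetical helper names, not part of stockpile-wrapper.py.

```python
def is_misconf(exc: Exception) -> bool:
    """True if the error text is Redis's MISCONF write refusal."""
    return str(exc).lstrip().startswith("MISCONF")


def set_with_hint(client, key, value):
    """Call client.set and re-raise MISCONF errors with a clearer hint.

    `client` is any object exposing set(key, value), e.g. a redis-py client.
    """
    try:
        return client.set(key, value)
    except Exception as exc:  # broad on purpose: this sketch does not import redis
        if is_misconf(exc):
            raise RuntimeError(
                "Redis rejected the write because its last RDB snapshot failed; "
                "check that the Redis data directory is writable"
            ) from exc
        raise
```

Any other error is re-raised unchanged, so existing error handling keeps working.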
Is this PR the duplicate, or is the other one?
The other one is a duplicate. Sorry about that. Closed that one.
As part of our effort to run the scale-ci tests on ARM, and after our sanity tests passed and @mffiedler and I discussed it, I opened this PR to update the config manager YAML file to switch from bitnami/redis:latest to docker.io/redis. We decided on this because our initial attempts to build a bitnami/redis ARM image ran into challenges, while the Redis ARM image readily available on docker.io showed promising test results.
Following up on Raul's note above: on further testing, I found that containers created from bitnami/redis have no trouble writing snapshot files to disk, while containers created from docker.io/redis fail to do so.
Output comparison:
Log output when using bitnami/redis:
redis 17:02:55.01 INFO ==> Starting Redis
1:C 05 May 2022 17:02:55.020 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 05 May 2022 17:02:55.020 # Redis version=6.2.7, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 05 May 2022 17:02:55.020 # Configuration loaded
1:M 05 May 2022 17:02:55.021 monotonic clock: POSIX clock_gettime
1:M 05 May 2022 17:02:55.021 # A key 'rediscompare_helper' was added to Lua globals which is not on the globals allow list nor listed on the deny list.
1:M 05 May 2022 17:02:55.021 Running mode=standalone, port=6379.
1:M 05 May 2022 17:02:55.021 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:M 05 May 2022 17:02:55.021 # Server initialized
1:M 05 May 2022 17:02:55.021 Ready to accept connections
1:signal-handler (1651770700) Received SIGTERM scheduling shutdown...
1:M 05 May 2022 17:11:40.355 # User requested shutdown...
1:M 05 May 2022 17:11:40.355 Calling fsync() on the AOF file.
1:M 05 May 2022 17:11:40.355 Saving the final RDB snapshot before exiting.
1:M 05 May 2022 17:11:40.355 DB saved on disk
1:M 05 May 2022 17:11:40.355 * Removing the pid file.
1:M 05 May 2022 17:11:40.355 # Redis is now ready to exit, bye bye...
[root@db5be9b29a4e Development]#
Log output when using docker.io/redis (a.k.a. quay.io/cloud-bulldozer/redis:latest):
[root@c442abdbf09a Development]# oc logs -f benchmark-controller-manager-df9b6cb7f-r7lzh -c redis-master
1:C 05 May 2022 00:39:21.985 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 05 May 2022 00:39:21.985 # Redis version=6.2.6, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 05 May 2022 00:39:21.985 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
1:M 05 May 2022 00:39:21.986 monotonic clock: POSIX clock_gettime
1:M 05 May 2022 00:39:21.986 Running mode=standalone, port=6379.
1:M 05 May 2022 00:39:21.986 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:M 05 May 2022 00:39:21.986 # Server initialized
1:M 05 May 2022 00:39:21.987 Ready to accept connections
1:signal-handler (1651711584) Received SIGTERM scheduling shutdown...
1:M 05 May 2022 00:46:24.427 # User requested shutdown...
1:M 05 May 2022 00:46:24.427 Saving the final RDB snapshot before exiting.
1:M 05 May 2022 00:46:24.427 # Failed opening the RDB file dump.rdb (in server root dir /data) for saving: Permission denied
1:M 05 May 2022 00:46:24.427 # Error trying to save the DB, can't exit.
1:M 05 May 2022 00:46:24.427 # SIGTERM received but errors trying to shut down the server, check the logs for more information
[root@c442abdbf09a Development]#
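The difference between the two logs comes down to the "Failed opening the RDB file" line. As a rough sketch, a few lines of Python can pull that line out of saved `oc logs` output. The regular expression is modeled only on the message format seen in the log above, and `find_rdb_save_failures` is a hypothetical helper name.

```python
import re

# Pattern modeled on the failure line in the docker.io/redis log above.
RDB_FAILURE = re.compile(
    r"Failed opening the RDB file (?P<file>\S+) "
    r"\(in server root dir (?P<dir>\S+)\) for saving: (?P<reason>.+)"
)


def find_rdb_save_failures(log_text: str):
    """Return (file, directory, reason) for each RDB save failure in the log."""
    return [
        (m.group("file"), m.group("dir"), m.group("reason").strip())
        for m in RDB_FAILURE.finditer(log_text)
    ]
```

Run against the second log, this would report `dump.rdb`, `/data`, and `Permission denied`; the bitnami/redis log yields an empty list.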
I am having some outstanding issues building the bitnami/redis image for aarch64 from scratch and am working with @mffiedler on it.
There is an outstanding issue that needs more work: we either need to get docker.io/redis snapshotting working or build a new bitnami/redis image with ARM support.
More details here: https://github.com/cloud-bulldozer/benchmark-operator/pull/752#issuecomment-1121407466
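To help narrow down the snapshotting problem, here is a small sketch of a writability probe that could be run inside the container as the same user as the Redis process. It only checks whether a file can be created in a directory; it does not explain why `/data` is not writable. `can_write` is a hypothetical helper name.

```python
import tempfile


def can_write(directory: str) -> bool:
    """Return True if a temporary file can be created in `directory`."""
    try:
        # The file is removed automatically when the context exits.
        with tempfile.TemporaryFile(dir=directory):
            return True
    except OSError:
        # Covers permission errors as well as a missing directory.
        return False
```

For example, `can_write("/data")` returning False would be consistent with the "Permission denied" error in the docker.io/redis log.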
/retest
@rsevilla87 Can we rerun the checks once more? We changed the Redis data dir for the image and want to see whether that resolves the issues above.
Hey! It seems I can't rerun the workflow without a code change. You can run git commit --amend and git push -f to force a new commit hash.
@rsevilla87
I pulled in the upstream changes from GitHub, and that seems to have triggered the tests right away. However, I suspect the tests will not pass, because the complete set of changes needed to resolve the snapshot issues is in master (https://github.com/cloud-bulldozer/benchmark-operator/compare/master...svetsa-rh:master) but not in this branch (redis-multiarch-image), for which the PR was originally created.
To get past the snapshot errors, config/manager/manager.yaml needs to be updated:
From: - mountPath: /redis-master-data
To: - mountPath: /data
Also, charts/benchmark-operator/values.yaml still refers to the old Redis image and needs to be updated too:
From: repository: bitnami/redis
To: repository: quay.io/cloud-bulldozer/redis
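To make the mountPath change concrete, here is a rough sketch that applies it to a container spec held as a Python dict (the shape a YAML parser would typically produce for that part of manager.yaml). The container name `redis-master` comes from the `oc logs` command above; the volume name `redis-data` and the helper `retarget_mount` are assumptions made for illustration only.

```python
def retarget_mount(container: dict, old_path: str, new_path: str) -> int:
    """Point every volume mount at old_path to new_path; return how many changed."""
    changed = 0
    for mount in container.get("volumeMounts", []):
        if mount.get("mountPath") == old_path:
            mount["mountPath"] = new_path
            changed += 1
    return changed


# Hypothetical container spec; only mountPath matters for this example.
redis_container = {
    "name": "redis-master",
    "volumeMounts": [{"name": "redis-data", "mountPath": "/redis-master-data"}],
}
retarget_mount(redis_container, "/redis-master-data", "/data")
```

The point of the change is that the volume ends up mounted at `/data`, the directory the docker.io/redis log reports as its server root dir.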
Updated redis mount dir. Tests re-triggered automatically.
Description
Point to redis multi-arch image to support arm architecture.
Fixes