scylladb / scylla-cluster-tests

Tests for Scylla Clusters
GNU Affero General Public License v3.0

Why do we rely on `sh` when running docker #8721

Open dkropachev opened 1 month ago

dkropachev commented 1 month ago

We want to introduce a custom scylla-bench image, which is based on `scratch`, so there is no shell in it. As a result, the run is failing with:

[2024-09-16T16:02:20.635Z] < t:2024-09-16 16:02:19,262 f:base.py         l:147  c:RemoteLibSSH2CmdRunner p:ERROR > <10.4.1.125>: Error executing command: "sudo  docker exec 4d7237fa463dccbb62a9942ba0e98ad68b25e897cb33a00181cdc1a32210a73f /bin/sh -c 'scylla-bench -workload=sequential -mode=write -replication-factor=3 -partition-count=100 -clustering-row-count=5555                       -clustering-row-size=uniform:1024..2048 -concurrency=10 -connection-count=10 -consistency-level=quorum -rows-per-request=10 -timeout=30s -validate-data -error-at-row-limit 1000 -nodes 10.4.2.113,10.4.3.237,10.4.3.250,10.4.3.53,10.4.3.32'"; Exit status: 1

[2024-09-16T16:02:20.635Z] < t:2024-09-16 16:02:19,264 f:file_logger.py  l:101  c:sdcm.sct_events.file_logger p:INFO  > 2024-09-16 16:02:19.262: (ScyllaBenchEvent Severity.CRITICAL) period_type=end event_id=70b93f59-f12e-4a6d-a0ef-8314abb8ba81 duration=0s: node=Node longevity-large-partitions-3h-maste-loader-node-a7dd38ed-1 [3.252.137.240 | 10.4.1.125]

[2024-09-16T16:02:20.635Z] < t:2024-09-16 16:02:19,264 f:file_logger.py  l:101  c:sdcm.sct_events.file_logger p:INFO  > stress_cmd=scylla-bench -workload=sequential -mode=write -replication-factor=3 -partition-count=100 -clustering-row-count=5555                       -clustering-row-size=uniform:1024..2048 -concurrency=10 -connection-count=10 -consistency-level=quorum -rows-per-request=10 -timeout=30s -validate-data -error-at-row-limit 1000 -nodes 10.4.2.113,10.4.3.237,10.4.3.250,10.4.3.53,10.4.3.32

[2024-09-16T16:02:20.635Z] < t:2024-09-16 16:02:19,264 f:file_logger.py  l:101  c:sdcm.sct_events.file_logger p:INFO  > errors:

[2024-09-16T16:02:20.635Z] < t:2024-09-16 16:02:19,264 f:file_logger.py  l:101  c:sdcm.sct_events.file_logger p:INFO  > 

[2024-09-16T16:02:20.635Z] < t:2024-09-16 16:02:19,264 f:file_logger.py  l:101  c:sdcm.sct_events.file_logger p:INFO  > Stress command completed with bad status 1: Error response from daemon: container 4d7237fa463dccbb62a9942ba0e98ad68b25e897cb33a00181cdc1a32210a73f is not running

Is there a way to stop using `sh -c` to run scylla-bench loaders?

dkropachev commented 1 month ago

@vponomaryov, can you please take a look at it.

vponomaryov commented 1 month ago

> @vponomaryov, can you please take a look at it.

It is standard code used for running commands in docker: https://github.com/scylladb/scylla-cluster-tests/blob/fd0a08c7/sdcm/utils/docker_remote.py#L105

So we need to either update SCT without breaking all the other places where docker.run gets used, or update the scylla-bench images to include `sh`.
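To illustrate the first option, here is a minimal sketch of how the host-side command could be built with the shell wrapper made optional. `build_docker_exec_cmd` and its `use_shell` parameter are hypothetical names used only for this example; this is not the actual code in `docker_remote.py`.

```python
import shlex


def build_docker_exec_cmd(container_id: str, cmd: str, use_shell: bool = True) -> str:
    """Build the host-side command line that runs `cmd` inside a container.

    Hypothetical helper for illustration only, not the SCT implementation.
    """
    prefix = f"sudo docker exec {shlex.quote(container_id)}"
    if use_shell:
        # Shell-wrapped form: needs /bin/sh to exist inside the image.
        return f"{prefix} /bin/sh -c {shlex.quote(cmd)}"
    # Shell-less form: split the command into argv and hand it to docker exec
    # directly. Pipes, redirects, globbing and variable expansion inside `cmd`
    # will not work here, because no shell interprets them in the container.
    argv = shlex.split(cmd)
    return f"{prefix} {shlex.join(argv)}"


if __name__ == "__main__":
    stress_cmd = "scylla-bench -workload=sequential -mode=write -timeout=30s"
    print(build_docker_exec_cmd("4d7237fa463d", stress_cmd))
    print(build_docker_exec_cmd("4d7237fa463d", stress_cmd, use_shell=False))
```

Keeping `use_shell=True` as the default would leave existing callers unchanged, and only loaders whose images have no shell would need to opt out. Whether that fits the real call sites in SCT is an assumption that needs checking.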