ClusterLabs / resource-agents

Combined repository of OCF agents from the RHCS and Linux-HA projects
GNU General Public License v2.0

docker-compose: use "docker compose" when not using older docker-compose command #1975

Closed oalbrigt closed 2 months ago

oalbrigt commented 2 months ago

Fixes https://github.com/ClusterLabs/resource-agents/issues/1974

minal3 commented 2 months ago

I saw your commit; I guess this issue is resolved?

oalbrigt commented 2 months ago

retest this please

oalbrigt commented 2 months ago

minal3 wrote: "I saw your commit; I guess this issue is resolved?"

The updated patch should fix it.

It appends " compose" to the command when running the command with -v does not output "docker-compose version 1.".
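The check described above can be sketched roughly as follows. This is a minimal illustration, not the actual patch: the function name `compose_command_for` is hypothetical, and the version text is passed in as an argument instead of being read from a real binary, so the sketch runs without Docker installed. The non-v1 version string is an illustrative sample only.

```shell
#!/bin/sh
# Hypothetical helper: pick the command string to use, given the base command
# and the text that "<base> -v" printed.
compose_command_for() {
    base=$1
    version_output=$2
    if printf '%s\n' "$version_output" | grep -q '^docker-compose version 1\.'; then
        # Legacy standalone docker-compose v1: keep the command unchanged.
        printf '%s\n' "$base"
    else
        # Anything else: append the "compose" subcommand.
        printf '%s\n' "$base compose"
    fi
}

# Sample version strings (illustrative, not captured from this thread).
v1_cmd=$(compose_command_for /usr/bin/docker-compose 'docker-compose version 1.29.2, build 5becea4c')
v2_cmd=$(compose_command_for /usr/bin/docker 'Docker version 27.0.0, build abcdef0')
echo "$v1_cmd"
echo "$v2_cmd"
```

Because the resulting string may contain a space, a caller would have to expand it unquoted for the shell to split it into a command and its first argument.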

minal3 commented 2 months ago

Yes, I have seen the conditional statement. I will run the patched RA on my test cluster and test it.

oalbrigt commented 2 months ago

I updated the patch to use $COMMAND -v, as I realized $OCF_RESKEY_binpath might not always be the command to use.

oalbrigt commented 2 months ago

retest this please

minal3 commented 2 months ago

Unfortunately this fix fails. I created two resources on a test cluster:

murat@node1:~$ sudo pcs resource
  * res10_vip   (ocf:heartbeat:IPaddr2):         Started node1.dc.lab
  * res20_httpd (ocf:heartbeat:docker-compose):  Stopped

ocf:heartbeat:docker-compose fails as follows:

Failed Resource Actions:
  * res20_httpd start on node1.dc.lab returned 'not installed' (Setup problem: couldn't find command: /usr/bin/docker-compose) at Mon Sep 16 20:56:34 2024 after 30ms

I extracted ONLY the lines that execute, up to right after the fix:

#!/bin/sh
##########################################################################
# Initialization:

: ${OCF_ROOT:=/usr/lib/ocf}
: ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/lib/heartbeat}
#. ${OCF_FUNCTIONS_DIR}/ocf-shellfuncs

# Defaults
OCF_RESKEY_binpath_default=/usr/bin/docker-compose
OCF_RESKEY_ymlfile_default=docker-compose.yml
: ${OCF_RESKEY_binpath=${OCF_RESKEY_binpath_default}}
: ${OCF_RESKEY_ymlfile=${OCF_RESKEY_ymlfile_default}}

if [ -r "$OCF_RESKEY_binpath" -a -x "$OCF_RESKEY_binpath" ]; then
    COMMAND="$OCF_RESKEY_binpath"
else
    COMMAND="$OCF_RESKEY_binpath_default"
fi

#$COMMAND -v | grep -q "^docker-compose version 1\." || COMMAND="$COMMAND compose"
$COMMAND -v | grep -q "^docker-compose version 1\." || COMMAND="${COMMAND%%-compose} compose"
echo "$COMMAND"

The penultimate line is my modification. Just execute the lines; it is simple enough. HOWEVER, the resource fails with my modification as well, issuing the same error:

Failed Resource Actions:
  * res20_httpd start on node1.dc.lab returned 'not installed' (Setup problem: couldn't find command: /usr/bin/docker-compose) at Mon Sep 16 20:56:34 2024 after 30ms
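For reference, the `${COMMAND%%-compose}` expansion used in the modified line removes a trailing "-compose" from the value, if there is one, before " compose" is appended. A small standalone sketch of that behaviour, where the function name `to_compose_subcommand` is hypothetical and only wraps the expansion for demonstration:

```shell
#!/bin/sh
# Strip a trailing "-compose" (if present), then append the "compose" subcommand.
to_compose_subcommand() {
    printf '%s\n' "${1%%-compose} compose"
}

from_legacy=$(to_compose_subcommand /usr/bin/docker-compose)
from_plain=$(to_compose_subcommand /usr/bin/docker)
echo "$from_legacy"   # /usr/bin/docker compose
echo "$from_plain"    # /usr/bin/docker compose
```

In isolation the expansion yields a sensible command string for both inputs.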

This means the RA already complains about the NON-EXISTENT docker-compose BEFORE it reaches the fix. I ran ocf-tester, and it got angry:

murat@node1:~$ sudo ocf-tester -n ocf:heartbeat:docker-compose /usr/lib/ocf/resource.d/heartbeat/docker-compose 
Beginning tests for /usr/lib/ocf/resource.d/heartbeat/docker-compose...
/usr/lib/ocf/resource.d/heartbeat/docker-compose: 119: /usr/bin/docker-compose: not found
/usr/lib/ocf/resource.d/heartbeat/docker-compose: 119: /usr/bin/docker-compose: not found
* rc=5: Validation failed.  Did you supply enough options with -o ?
/usr/lib/ocf/resource.d/heartbeat/docker-compose: 119: /usr/bin/docker-compose: not found
ocf-exit-reason:Setup problem: couldn't find command: /usr/bin/docker-compose
Aborting tests

oalbrigt commented 2 months ago

I have one more fix to push before you can do further testing.

oalbrigt commented 2 months ago

I think I managed to cover all the corner-cases now.

oalbrigt commented 2 months ago

I added an update to also avoid an error being output if the command doesn't exist.
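One way such a quiet probe could look is sketched below. This is an assumption about the general approach, not the code that was pushed: the helper name `is_compose_v1` and the stand-in function `fake_compose_v1` are hypothetical, and the stand-in replaces a real binary so the sketch runs without Docker installed.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if "$1 -v" reports docker-compose v1.
# A missing command is detected first, and stderr of the probe is discarded,
# so nothing like "not found" is printed.
is_compose_v1() {
    command -v "$1" >/dev/null 2>&1 || return 1
    "$1" -v 2>/dev/null | grep -q '^docker-compose version 1\.'
}

# Stand-in for a real v1 binary (illustrative version string).
fake_compose_v1() {
    echo 'docker-compose version 1.29.2, build 5becea4c'
}

if is_compose_v1 fake_compose_v1; then stub_result=v1; else stub_result=not-v1; fi
if is_compose_v1 /nonexistent/docker-compose; then missing_result=v1; else missing_result=not-v1; fi
# Capture anything the probe writes to stdout or stderr for the missing command.
missing_output=$(is_compose_v1 /nonexistent/docker-compose 2>&1) || true
echo "$stub_result $missing_result [$missing_output]"
```

Checking for the command before running it keeps the probe silent regardless of how a particular shell reports a failed exec.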

minal3 commented 2 months ago

Hello, the patch is working fine now. First I ran it against ocf-tester:

root@node1: ~
> ocf-tester -n ocf:heartbeat:docker-compose -o dirpath=/opt/dockerfiles/revproxy-nginx -o ymlfile=docker-compose.yaml /usr/lib/ocf/resource.d/heartbeat/docker-compose 
Beginning tests for /usr/lib/ocf/resource.d/heartbeat/docker-compose...
* Your agent does not support the notify action (optional)
* Your agent does not support the demote action (optional)
* Your agent does not support the promote action (optional)
* Your agent does not support promotable clones (optional)
* Your agent does not support the reload action (optional)
/usr/lib/ocf/resource.d/heartbeat/docker-compose passed all tests

My test cluster includes 4 nodes, one of them being a qdevice. All non-witness nodes have an nginx Docker container image and the same docker-compose.yaml file. All nodes have ocf:heartbeat:docker-compose patched with your latest fix.

Result: I moved

Resource: res20_httpd (class=ocf provider=heartbeat type=docker-compose)

among all three nodes. The RA executes docker-compose as expected. There were no errors: the nginx container runs on whichever node holds the resource, and I was able to access the default web page over the VIP address.

minal3 commented 2 months ago

Two items to discuss:

  1. I have NOT tested with docker-compose v1.29; after all this thorough work, that should be done.

  2. The default value of the "ymlfile" attribute is docker-compose.yml, which is outdated. This is from the official Compose documentation:

The default path for a Compose file is compose.yaml (preferred) or compose.yml that is placed in the working directory. Compose also supports docker-compose.yaml and docker-compose.yml for backwards compatibility of earlier versions. If both files exist, Compose prefers the canonical compose.yaml.

This is the resource definition on my test cluster:

Resource: res20_httpd (class=ocf provider=heartbeat type=docker-compose)
  Attributes: res20_httpd-instance_attributes
    dirpath=/opt/dockerfiles/revproxy-nginx/
    ymlfile=docker-compose.yaml
  Operations:
    monitor: res20_httpd-monitor-interval-60s
      interval=60s timeout=10s
    start: res20_httpd-start-interval-0s
      interval=0s timeout=240s
    stop: res20_httpd-stop-interval-0s
      interval=0s timeout=20s

I deleted the ymlfile attribute and the resource failed, presumably because it defaults to the .yml extension. Do you think the default value should be updated to .yaml?

Or should the agent do an existence test: [ -e docker-compose.yaml ] || [ -e docker-compose.yml ]
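Such an existence test might look like the sketch below. It is not part of the agent: the function name `find_compose_file` is hypothetical, and the order in which the two legacy names are tried is an assumption, since the documentation quoted above only states that compose.yaml is preferred.

```shell
#!/bin/sh
# Hypothetical helper: print the first Compose file name found in a directory,
# trying the canonical names before the legacy ones. Returns 1 if none exists.
find_compose_file() {
    dir=$1
    for name in compose.yaml compose.yml docker-compose.yaml docker-compose.yml; do
        if [ -f "$dir/$name" ]; then
            printf '%s\n' "$name"
            return 0
        fi
    done
    return 1
}

# Demonstration in throwaway directories.
demo_dir=$(mktemp -d)
touch "$demo_dir/docker-compose.yml" "$demo_dir/docker-compose.yaml"
legacy_pick=$(find_compose_file "$demo_dir")
touch "$demo_dir/compose.yaml"
canonical_pick=$(find_compose_file "$demo_dir")
empty_dir=$(mktemp -d)
if find_compose_file "$empty_dir" >/dev/null; then none_status=found; else none_status=missing; fi
rm -rf "$demo_dir" "$empty_dir"
echo "$legacy_pick $canonical_pick $none_status"
```

A fallback like this would only matter when the ymlfile parameter is left unset; an explicitly configured name would still take precedence.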

oalbrigt commented 2 months ago

Nice to hear the patch is working.

Regarding .yml vs .yaml, that's just a user preference, which is why there is a parameter for setting it if your file has a different name than the agent expects by default, or if you have multiple YAML files in the directory.