Closed oalbrigt closed 2 months ago
I saw your commit; I guess this issue is resolved?
retest this please
> I saw your commit; I guess this issue is resolved?
The updated patch should fix it.
It adds " compose" to the command when "docker-compose version 1." isn't in the output of running the command with -v.
Yes, I have seen the conditional statement. I will run the patched RA on my test cluster and test it.
I updated the patch to use $COMMAND -v, as I realized $OCF_RESKEY_binpath might not always be the command to use.
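The check described above can be sketched as a small function. This is only an illustration under stated assumptions, not the patch itself: compose_command is a hypothetical name, and the version banner is passed in as a string so the logic can be exercised without Docker installed. The banners used in the usage note are sample inputs.

```shell
#!/bin/sh
# Hypothetical helper, not part of the resource agent.
#   $1: base command (for example the value of $COMMAND)
#   $2: the text printed by running "$1 -v"
compose_command() {
    if printf '%s\n' "$2" | grep -q '^docker-compose version 1\.'; then
        # Compose v1 banner: the standalone binary is the whole command.
        printf '%s\n' "$1"
    else
        # Any other output: append the "compose" subcommand.
        printf '%s\n' "$1 compose"
    fi
}
```

For example, `compose_command /usr/bin/docker "Docker Compose version v2.20.2"` prints `/usr/bin/docker compose`, while a banner starting with "docker-compose version 1." leaves the command unchanged.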
retest this please
Unfortunately this fix fails. I created two resources on a test cluster:
murat@node1:~$ sudo pcs resource
* res10_vip (ocf:heartbeat:IPaddr2): Started node1.dc.lab
* res20_httpd (ocf:heartbeat:docker-compose): Stopped
ocf:heartbeat:docker-compose fails as follows:
Failed Resource Actions:
* res20_httpd start on node1.dc.lab returned 'not installed' (Setup problem: couldn't find command: /usr/bin/docker-compose) at Mon Sep 16 20:56:34 2024 after 30ms
I extracted ONLY the lines that execute, up to right after the fix:
#!/bin/sh
##########################################################################
# Initialization:
: ${OCF_ROOT:=/usr/lib/ocf}
: ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/lib/heartbeat}
#. ${OCF_FUNCTIONS_DIR}/ocf-shellfuncs
# Defaults
OCF_RESKEY_binpath_default=/usr/bin/docker-compose
OCF_RESKEY_ymlfile_default=docker-compose.yml
: ${OCF_RESKEY_binpath=${OCF_RESKEY_binpath_default}}
: ${OCF_RESKEY_ymlfile=${OCF_RESKEY_ymlfile_default}}
if [ -r "$OCF_RESKEY_binpath" -a -x "$OCF_RESKEY_binpath" ]; then
    COMMAND="$OCF_RESKEY_binpath"
else
    COMMAND="$OCF_RESKEY_binpath_default"
fi
#$COMMAND -v | grep -q "^docker-compose version 1\." || COMMAND="$COMMAND compose"
$COMMAND -v | grep -q "^docker-compose version 1\." || COMMAND="${COMMAND%%-compose} compose"
echo "$COMMAND"
The penultimate line is my modification. Just execute the lines; it is simple enough. HOWEVER, the resource also fails with my modification, issuing the same error:
Failed Resource Actions:
* res20_httpd start on node1.dc.lab returned 'not installed' (Setup problem: couldn't find command: /usr/bin/docker-compose) at Mon Sep 16 20:56:34 2024 after 30ms
This means the RA already complains about the NON-EXISTENT docker-compose BEFORE it reaches the fix. I ran ocf-tester, and it failed as well:
murat@node1:~$ sudo ocf-tester -n ocf:heartbeat:docker-compose /usr/lib/ocf/resource.d/heartbeat/docker-compose
Beginning tests for /usr/lib/ocf/resource.d/heartbeat/docker-compose...
/usr/lib/ocf/resource.d/heartbeat/docker-compose: 119: /usr/bin/docker-compose: not found
/usr/lib/ocf/resource.d/heartbeat/docker-compose: 119: /usr/bin/docker-compose: not found
* rc=5: Validation failed. Did you supply enough options with -o ?
/usr/lib/ocf/resource.d/heartbeat/docker-compose: 119: /usr/bin/docker-compose: not found
ocf-exit-reason:Setup problem: couldn't find command: /usr/bin/docker-compose
Aborting tests
I have one more fix to push before you can do further testing.
I think I managed to cover all the corner-cases now.
I added an update to also avoid an error being output if the command doesn't exist.
Hello, the patch is working fine now. First I ran it against ocf-tester:
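One way to avoid the "not found" noise is to confirm that the command is an executable file before running it, and to discard stderr from the version call. The following is a sketch of that idea only; probe_is_compose_v1 is a hypothetical name, and this is not necessarily how the patch does it.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if $1 is an executable file whose
# "-v" output starts with the Compose v1 banner.
probe_is_compose_v1() {
    # A missing or non-executable path counts as "not v1"; nothing is run.
    [ -f "$1" ] && [ -x "$1" ] || return 1
    # stderr is discarded so a failing probe prints no error message.
    "$1" -v 2>/dev/null | grep -q '^docker-compose version 1\.'
}
```

A caller could then append " compose" to the command only when this function returns non-zero, without any error text reaching the cluster logs.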
root@node1: ~
> ocf-tester -n ocf:heartbeat:docker-compose -o dirpath=/opt/dockerfiles/revproxy-nginx -o ymlfile=docker-compose.yaml /usr/lib/ocf/resource.d/heartbeat/docker-compose
Beginning tests for /usr/lib/ocf/resource.d/heartbeat/docker-compose...
* Your agent does not support the notify action (optional)
* Your agent does not support the demote action (optional)
* Your agent does not support the promote action (optional)
* Your agent does not support promotable clones (optional)
* Your agent does not support the reload action (optional)
/usr/lib/ocf/resource.d/heartbeat/docker-compose passed all tests
My test cluster includes 4 nodes, one being a qdevice. All non-witness nodes have an nginx docker container image & the same docker-compose.yaml file. All nodes have ocf:heartbeat:docker-compose patched with your latest fix.
Result: I moved the resource res20_httpd (class=ocf provider=heartbeat type=docker-compose) among all three nodes. The RA executes docker-compose as expected. There were no errors, the nginx container ran on whichever node held the resource, and I was able to access the default web page over the VIP address.
Two items to discuss:
I have NOT done a test with docker-compose v1.29. After all this thorough work, it should be done.
The default value of the attribute "ymlfile" is docker-compose.yml, which is outdated. This is from the official Compose documentation:
The default path for a Compose file is compose.yaml (preferred) or compose.yml that is placed in the working directory. Compose also supports docker-compose.yaml and docker-compose.yml for backwards compatibility of earlier versions. If both files exist, Compose prefers the canonical compose.yaml.
This is the resource definition on my test cluster:
Resource: res20_httpd (class=ocf provider=heartbeat type=docker-compose)
  Attributes: res20_httpd-instance_attributes
    dirpath=/opt/dockerfiles/revproxy-nginx/
    ymlfile=docker-compose.yaml
  Operations:
    monitor: res20_httpd-monitor-interval-60s
      interval=60s timeout=10s
    start: res20_httpd-start-interval-0s
      interval=0s timeout=240s
    stop: res20_httpd-stop-interval-0s
      interval=0s timeout=20s
I deleted the attribute ymlfile and the resource failed, presumably because it falls back to the default name with the .yml extension. Do you think the default value should be updated to .yaml?
Or an existence test: [ -a docker-compose.yaml ] || [ -a docker-compose.yml ]
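The existence test suggested above could look roughly like the sketch below. It is an illustration, not agent code: find_compose_file is a hypothetical helper. The candidate list puts compose.yaml first, as the quoted Compose documentation does, and the order of the remaining names is an assumption. It uses -e rather than -a because -e is the file-existence test that POSIX sh specifies.

```shell
#!/bin/sh
# Hypothetical helper: print the name of the first Compose file found in
# directory $1, or return 1 if none of the candidates exists.
find_compose_file() {
    for name in compose.yaml compose.yml docker-compose.yaml docker-compose.yml; do
        if [ -e "$1/$name" ]; then
            printf '%s\n' "$name"
            return 0
        fi
    done
    # No candidate file exists in the directory.
    return 1
}
```

With such a helper, the agent could fall back to whatever file is present in dirpath when ymlfile is not set, while an explicit ymlfile would still take priority.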
Nice to hear the patch is working.
Regarding .yml vs .yaml, that's just a user preference. It's why there is a parameter for setting it: for when your file has a different name than the agent expects by default, and for when you have multiple yaml files in the directory.
Fixes https://github.com/ClusterLabs/resource-agents/issues/1974