Open remixtj opened 8 years ago
We had the same problem. If you set the step as a Workflow step, rather than a node step, this works.
FYI this is the "error handler" we use with continue on success set:
- name: NoMatchedNodes_Ok
  project: name
  loglevel: INFO
  options:
    reason:
      required: true
      value: main
      description: "The value of result.reason"
  sequence:
    keepgoing: false
    strategy: node-first
    commands:
    - script: |-
        echo "Failure code: @option.reason@ (ignoring: NoMatchedNodes)"
        test "@option.reason@" = "NoMatchedNodes"
      nodeStep: true
      description: Looking for failure reason to be NoMatchedNodes
  description: Node filter failed to match - Do not fail
  group: ErrorHandler
and called like this:
errorhandler:
  jobref:
    group: ErrorHandler
    name: NoMatchedNodes_Ok
    args: -reason ${result.reason}
  keepgoingOnSuccess: true
It doesn't work if nodeStep: true is set for the job.
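The check that the script step performs can be tried in a plain shell. This is only a sketch: `check_reason` is a hypothetical function standing in for the step, and its argument plays the role of the value Rundeck substitutes for @option.reason@.

```shell
#!/bin/bash
# check_reason is a hypothetical stand-in for the script step above.
# $1 takes the place of the substituted @option.reason@ value.
check_reason() {
    echo "Failure code: $1 (ignoring: NoMatchedNodes)"
    # The exit status of test becomes the exit status of the function.
    test "$1" = "NoMatchedNodes"
}

check_reason "NoMatchedNodes"; echo "exit status: $?"
check_reason "JobFailed";      echo "exit status: $?"
```

The first call exits with status 0, which is the case where keepgoingOnSuccess lets the workflow continue. Any other reason makes the handler itself fail, so the original failure stands.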
I implemented a simpler error handler: a local script called rundeck-errorhandler.sh, placed on the Rundeck server, is set as the error handler.
The script is called in this way:
/usr/local/bin/rundeck-errorhandler.sh "${result.message}"
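The script itself:
#!/bin/bash
# The failure message is passed as the first argument.
ERROR_MSG="$1"
# Extended regex of the error messages that should be treated as handled.
ERRORS_HANDLED="(No nodes matched)"
# grep's exit status is the script's exit status: 0 only if the message matches.
echo "$ERROR_MSG" | grep -E "$ERRORS_HANDLED"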
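To see how the exit status decides whether the error counts as handled, the same grep check can be exercised outside Rundeck. This is a minimal sketch: the function name `is_handled` is hypothetical, and the first sample message is made up for illustration rather than copied from Rundeck output.

```shell
#!/bin/bash
# Same pattern as in rundeck-errorhandler.sh.
ERRORS_HANDLED="(No nodes matched)"

# is_handled is a hypothetical wrapper around the grep check.
is_handled() {
    # grep -q prints nothing; it exits 0 on a match and 1 otherwise.
    printf '%s\n' "$1" | grep -qE "$ERRORS_HANDLED"
}

# Illustrative message only; the real Rundeck wording may differ.
if is_handled "No nodes matched for the filter"; then
    echo "handled: workflow continues"
fi

# A message that does not match the pattern is not handled.
if ! is_handled "Remote command failed with exit status 1"; then
    echo "not handled: step still fails"
fi
```

Quoting "$1" and "$ERRORS_HANDLED" matters here: unquoted, the pattern would be split on spaces and grep would treat the extra words as file names.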
An example job:
<joblist>
  <job>
    <description></description>
    <dispatch>
      <excludePrecedence>true</excludePrecedence>
      <keepgoing>true</keepgoing>
      <rankOrder>ascending</rankOrder>
      <threadcount>30</threadcount>
    </dispatch>
    <executionEnabled>true</executionEnabled>
    <group>TEST/patch_v2</group>
    <id>faaec3dc-049b-47cd-ab9b-1b916e2422a8</id>
    <loglevel>INFO</loglevel>
    <name>_TEMPLATE Patching TEST error handler</name>
    <nodefilters>
      <filter>.*</filter>
    </nodefilters>
    <nodesSelectedByDefault>false</nodesSelectedByDefault>
    <scheduleEnabled>true</scheduleEnabled>
    <sequence keepgoing='false' strategy='node-first'>
      <command>
        <errorhandler keepgoingOnSuccess='true'>
          <node-step-plugin type='localexec'>
            <configuration>
              <entry key='command' value='/usr/local/bin/rundeck-errorhandler.sh "${result.message}"' />
            </configuration>
          </node-step-plugin>
        </errorhandler>
        <jobref group='TEST/patch_v2' name='Before - Physical Machine' nodeStep='true'>
          <nodefilters>
            <filter>tags: is_virtual=false name: ${node.hostname}</filter>
          </nodefilters>
        </jobref>
      </command>
    </sequence>
    <uuid>faaec3dc-049b-47cd-ab9b-1b916e2422a8</uuid>
  </job>
</joblist>
With this job we want to execute the step "Before - Physical Machine" if and only if the given host is a physical machine (tagged is_virtual=false). If it is a virtual machine (is_virtual=true), the step should fail with NoMatchedNodes because the filter matches nothing, and the workflow should skip to the next step because the error handler has keepgoingOnSuccess set to 'true'.
I created a job containing a job reference. On the job reference I overrode the node filter with one that returns an empty set of nodes, and added a simple error handler to catch the error and print the failure reason. In the anvils demo, the filter I entered is tags: www+db, which matches no nodes. The error handler simply runs echo ${result.reason}.
Expected result:
echo ${result.reason} output is NoMatchedNodes
Obtained result:
echo ${result.reason} output is Unknown
Samples
Countercheck
I also ran a test to check that my usage of the ${result.reason} variable is correct. The error handler stayed the same, and I put a valid filter on the called job. The called job executes exit 1, so it always fails. In this case the value of ${result.reason} is correctly set to JobFailed.
Countercheck sample output:
Remote command failed with exit status 1
Failed: NonZeroResultCode: Remote command failed with exit status 1
Failed: JobFailed: Job [TEST/failing job] failed
JobFailed
Execution failed: 6: [Workflow result: , step failures: {1=Dispatch failed on 1 nodes: [app1.anvils.com: JobFailed: Job [TEST/failing job] failed]}, Node failures: {app1.anvils.com=[JobFailed: Job [TEST/failing job] failed]}, flow control: Continue, status: failed]
Sample jobs
These are the jobs I created on anvils-demo. You can import and run them immediately to reproduce the issue.
fc5429cb-5ec9-4f49-bd37-30a140be6a92.yaml.txt 896799cb-1990-42f2-b17c-fb762f4b1f0a.yaml.txt