Woodpecker pipeline script ignores exit code from subshell

svdHero commented 1 year ago

Component

other

Describe the bug

I have the following .woodpecker.yml file:

pipeline:
  my-step:
    image: debian:bullseye-slim
    commands:
      - echo "Do one thing."
      - (ls /this-directory-does-not-exist; exit 0)
      - echo "Do something else."

The pipeline run fails with error message

ls: cannot access '/this-directory-does-not-exist': No such file or directory

which I don't understand, since the subshell (...) exits with status code 0.

Other CI systems, such as Jenkins or Azure DevOps, honor subshell exit status codes. Am I doing something wrong? Are subshells not supported by Woodpecker?

System Info

{
    "source":"https://github.com/woodpecker-ci/woodpecker",
    "version":"next-86e49851"
}

Additional context

Validations

[X] Read the Contributing Guidelines.
[X] Read the docs.
[X] Check that there isn't already an issue that reports the same bug to avoid creating a duplicate.
[X] Check that this is a concrete bug. For Q&A join our Discord Chat Server or the Matrix room.

gapodo commented 1 year ago

Can't comment on the subshell, as I've never had any problems with it (maybe that's a comment already).

Re your use of a subshell, from a performance perspective I'd rather use ls /this-directory-does-not-exist || exit 0 instead of the subshell (one less process spawned, one less clone of the env,...) which should also be more resilient with respect to different shells (if I remember correctly pipelines use /bin/sh, which varies based on the image used and is not always the same as the default shell of the distro or image (images sometimes diverge from the distro standard), which for Debian should be dash IIRC)

smainz commented 1 year ago

The reason is woodpecker calls the shell with -e. This is a simlation of what happenes using the image debian:bullseye-slim above:

root@39b55d4bda40:/# # This is how woodpecker calls the commands
cat <<EOF | /bin/sh  -e
echo + 'echo "Do one thing."'
echo "Do one thing."

echo + '(ls /this-directory-does-not-exist; exit 0)'
(ls /this-directory-does-not-exist; exit 0)

echo + 'echo "Do something else."'
echo "Do something else."
EOF
+ echo "Do one thing."
Do one thing.
+ (ls /this-directory-does-not-exist; exit 0)
ls: cannot access '/this-directory-does-not-exist': No such file or directory
root@39b55d4bda40:/#

And here it is without the -e flag:

root@39b55d4bda40:/# # The same without -e flag to sh
cat <<EOF | /bin/sh
echo + 'echo "Do one thing."'
echo "Do one thing."

echo + '(ls /this-directory-does-not-exist; exit 0)'
(ls /this-directory-does-not-exist; exit 0)

echo + 'echo "Do something else."'
echo "Do something else."
EOF
+ echo "Do one thing."
Do one thing.
+ (ls /this-directory-does-not-exist; exit 0)
ls: cannot access '/this-directory-does-not-exist': No such file or directory
+ echo "Do something else."
Do something else.
root@39b55d4bda40:/#

From the manual of sh:

-e errexit If not interactive, exit immediately if any untested command fails. The exit status of a command is considered to be explicitly tested if the command is used to control an if elif while or until or if the command is the left hand operand of an &&'' or||'' operator.

Can anyone predict what happenes, if the -e is dropped in the code?

gapodo commented 1 year ago

Can anyone predict what happenes, if the -e is dropped in the code?

This would open a huge can of worms, since we would no longer evaluate the exit state of "stuff" executed in subshells anymore, but more pressingly IIRC not failing on non 0 exit codes would mean pipelines may get executed further than the failure, which may lead to broken builds.

I've just tested a bit to verify my suspicion on the impact of the -e, if you are executing a script, it'll ignore what ever happens inside the script, as long as you exit the script with 0. So the only place where having a -e is going to bite us is when running subshells without capturing non 0 exit codes.

Subshells failing when not captured is IMHO a good default, as subshells are often used to prepare variables,... and them failing should in most cases take the entire step with them. If one knows it may fail, and it's ok, capturing with || should do the trick, if it gets more complicated (requiring more complex if else, loops,...) it's time for either a Makefile or build-script anyway.

svdHero commented 1 year ago

Thank you guys for the detailed explanations. I think the || solution is fine. I will test that on Monday and then cloee the issue.

svdHero commented 1 year ago

The || solution works for me. Thanks again for the support and keep up the great work.

woodpecker-ci / woodpecker