Closed misterfifths closed 5 months ago
I'm afraid there's nothing we can do to make this work on the GitLab Tart Executor's side (but I'd love to be wrong here).
The reason is that GitLab Runner simply does not allow its Custom executors to return anything other than BUILD_FAILURE_EXIT_CODE and SYSTEM_FAILURE_EXIT_CODE. This logic is hardcoded in executors/custom/command/command.go:
func (c *command) waitForCommand() {
    err := c.cmd.Wait()

    eerr, ok := err.(*exec.ExitError)
    if ok {
        exitCode := getExitCode(eerr)
        switch {
        case exitCode == BuildFailureExitCode:
            err = &common.BuildError{Inner: eerr, ExitCode: exitCode}
        case exitCode != SystemFailureExitCode:
            err = &ErrUnknownFailure{Inner: eerr, ExitCode: exitCode}
        }
    }

    c.waitCh <- err
}
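To make the consequence of that switch concrete, here is a minimal, simplified model of the classification. It is a sketch, not GitLab Runner's code: the `classify` function and the constant values (1 and 2) are assumptions chosen for illustration, standing in for whatever the runner passes via BUILD_FAILURE_EXIT_CODE and SYSTEM_FAILURE_EXIT_CODE.

```go
package main

import "fmt"

// Assumed stand-ins for the values the runner hands to the executor via
// BUILD_FAILURE_EXIT_CODE and SYSTEM_FAILURE_EXIT_CODE.
const (
	buildFailureExitCode  = 1
	systemFailureExitCode = 2
)

// classify is a hypothetical helper that models the switch in
// waitForCommand: only the two designated codes are recognized, and every
// other non-zero code ends up as an "unknown failure".
func classify(exitCode int) string {
	switch {
	case exitCode == 0:
		return "success"
	case exitCode == buildFailureExitCode:
		return "build failure"
	case exitCode == systemFailureExitCode:
		return "system failure"
	default:
		return "unknown failure"
	}
}

func main() {
	for _, code := range []int{0, 1, 2, 42} {
		fmt.Printf("exit code %d -> %s\n", code, classify(code))
	}
}
```

Under this model, a script exit code such as 42 can never reach the runner as a build failure, because the only code that maps to a build failure is the single designated one.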
I've tried modifying the GitLab Tart Executor to pass through the exit code from the SSH session:
diff --git a/cmd/gitlab-tart-executor/main.go b/cmd/gitlab-tart-executor/main.go
index 9847e0f..1db1247 100644
--- a/cmd/gitlab-tart-executor/main.go
+++ b/cmd/gitlab-tart-executor/main.go
@@ -2,7 +2,9 @@ package main
 import (
     "context"
+    "errors"
     "github.com/cirruslabs/gitlab-tart-executor/internal/commands"
+    "golang.org/x/crypto/ssh"
     "log"
     "os"
     "os/signal"
@@ -35,6 +37,12 @@ func main() {
     if err := commands.NewRootCmd().ExecuteContext(ctx); err != nil {
         log.Println(err)
+
+        var sshExitError *ssh.ExitError
+        if errors.As(err, &sshExitError) {
+            os.Exit(sshExitError.ExitStatus())
+        }
+
         os.Exit(failureExitCode)
     }
 }
But this simply results in an "ERROR: Job failed (system failure): unknown Custom executor executable exit code X; executable execution terminated with: exit status X" error, and allow_failure:exit_codes is not evaluated because what is raised is not a build error but an unknown error.
Oh, I had no idea it was a GitLab-level issue. Thanks for the thorough investigation!
In our .gitlab-ci.yml, we use the allow_failure/exit_codes feature to indicate certain types of CI failures that are acceptable. That works by inspecting the exit code of the build script and comparing it to the given list.

However, in the case of a failure, the executor only ever exits with code 1 (or BUILD_FAILURE_EXIT_CODE if it is set); it ignores any actual exit code from the script run in the VM. The relevant code is here: https://github.com/cirruslabs/gitlab-tart-executor/blob/1f5b77e214bd74ff8cc59458ddb7a83a4301e8b1/cmd/gitlab-tart-executor/main.go#L27-L39.

It would be great if the exit code of the build script in the VM was propagated to the exit code of the executor itself, so that we could use allow_failure. Naively that might involve inspecting the error in the code above to see if it's an ssh.ExitError, and passing along its exit code if so. That might be a little invasive though, since main.go doesn't import the ssh library.