sbt / sbt-native-packager

sbt Native Packager
https://sbt-native-packager.readthedocs.io/en/stable/
BSD 2-Clause "Simplified" License

Error 139 on Docker version 1.39 #1346

Closed: damienmarshall closed this issue 4 years ago

damienmarshall commented 4 years ago

When executing sbt --debug docker:publish on Kubernetes version 1.14 in GKE (which enforces docker version 18.09) we're seeing the following error on the latest version:

[debug] Forcing garbage collection...
java.lang.RuntimeException: Nonzero exit value: 139
    at com.typesafe.sbt.packager.docker.DockerPlugin$.publishLocalDocker(DockerPlugin.scala:622)
    at com.typesafe.sbt.packager.docker.DockerPlugin$$anonfun$projectSettings$28.apply(DockerPlugin.scala:244)
    at com.typesafe.sbt.packager.docker.DockerPlugin$$anonfun$projectSettings$28.apply(DockerPlugin.scala:242)
    at scala.Function1$$anonfun$compose$1.apply(Function1.scala:47)
    at sbt.$tilde$greater$$anonfun$$u2219$1.apply(TypeFunctions.scala:40)
    at sbt.std.Transform$$anon$4.work(System.scala:63)
    at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:228)
    at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:228)
    at sbt.ErrorHandling$.wideConvert(ErrorHandling.scala:17)
    at sbt.Execute.work(Execute.scala:237)
    at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:228)
    at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:228)
    at sbt.ConcurrentRestrictions$$anon$4$$anonfun$1.apply(ConcurrentRestrictions.scala:159)
    at sbt.CompletionService$$anon$2.call(CompletionService.scala:28)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
[error] (root/docker:publishLocal) Nonzero exit value: 139

We've tried a few different options to no avail! This doesn't happen on an earlier version of Kubernetes in GKE on Docker version 17.03.2-ce.

Any advice on how to proceed with debugging what's happening? Has anyone seen this issue before?

muuki88 commented 4 years ago

Thanks @damienmarshall for the bug report :smile: . It would be helpful to fill out the default issue template to get a better understanding.

That said, is there any output before this exception? sbt native packager uses the native docker binary, so you can reproduce step by step what the plugin does:

  1. sbt docker:stage
  2. go to target/docker/stage
  3. execute what sbt "show dockerBuildCommand" returns, e.g. something like docker build --force-rm .

This should trigger the same error and give you more logs if they are not in sbt itself.
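
The steps above can be sketched as a small shell helper. This is only a sketch under assumptions: `run` and `reproduce_docker_build` are hypothetical names, the DRY_RUN switch is an illustration-only convenience, and the build command should be replaced with whatever sbt "show dockerBuildCommand" prints for your project (it may include extra arguments such as a -t tag).

```shell
# Hypothetical helper: with DRY_RUN=1 it prints each command instead of running it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical helper: stage the Docker context with sbt, then run the
# docker build by hand so its own output and exit status are visible.
reproduce_docker_build() {
  run sbt docker:stage || return
  run cd target/docker/stage || return
  # Assumption: replace this with the output of: sbt "show dockerBuildCommand"
  run docker build --force-rm .
  status=$?
  echo "docker build exit status: $status"
  return "$status"
}
```

Calling reproduce_docker_build with DRY_RUN=1 set only previews the commands; without it, the function runs them from the project root and reports the exit status of the docker build.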

damienmarshall commented 4 years ago

Thanks for the feedback @muuki88, sorry for the missing details. We tried version 1.7.3 with the same issue (was happening also with 1.1.5).

It's a little bit tricky for us to recreate this manually, as it's happening within GKE itself. We did run sbt with debug logs in Jenkins and saw the following:

[success] All package validations passed
[debug] Executing Native docker build --force-rm -t <redacted_name>.
[debug] Working directory /home/jenkins/agent/workspace/<redacted_name>/target/docker/stage
[debug] Forcing garbage collection...
java.lang.RuntimeException: Nonzero exit value: 139

So the native docker command being run is

docker build --force-rm -t <redacted_name> .

We will keep digging here to see if we can recreate it too. Any advice on extra flags or the like that we can set to help determine the issue is very much appreciated!

Thank you again for your help!

Damien

muuki88 commented 4 years ago

That's super good to know that this issue hits both 1.1.5 and 1.7.3, because a lot of work has been done on the docker plugin between those releases. So it's probably a general issue with the Dockerfile that is being created.

Speaking of the Dockerfile. It's located under target/docker/stage/Dockerfile. You could use this to check for errors.

I googled for error code 139 and there seem to be quite a few reported issues.
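
For context, shells conventionally report an exit status above 128 when a process was terminated by a signal, with the signal number being the status minus 128. Under that convention 139 corresponds to signal 11, SIGSEGV (a segmentation fault). A minimal sketch of that decoding follows; `describe_exit` is a hypothetical helper name, not part of sbt or Docker.

```shell
# Hypothetical helper: translate an exit status into a readable description.
describe_exit() {
  status=$1
  if [ "$status" -gt 128 ]; then
    # Statuses above 128 conventionally encode "128 + signal number".
    sig=$((status - 128))
    echo "killed by signal $sig ($(kill -l "$sig"))"
  else
    echo "exited with status $status"
  fi
}

describe_exit 139
```

This only interprets the number; it does not say why the docker binary crashed.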

damienmarshall commented 4 years ago

OK, it looks like this is an environment issue. In GKE 1.13 it appears that the ability to run the host docker executable from within a docker container has been broken, which causes the segfault: https://laupow.github.io/2019/05/gke-v1.13-upgrade/

I was able to get a partially successful build by having the docker binary in the container itself, so this isn't a library issue.

Thank you for your help @muuki88 ! Appreciate it.