moby / moby

The Moby Project - a collaborative project for the container ecosystem to assemble container-based systems
https://mobyproject.org/
Apache License 2.0

Cannot build emacs using Dockerfile #22801

Closed by Silex 5 years ago

Silex commented 8 years ago

Hello,

I'm building https://github.com/Silex/docker-emacs/blob/master/24.5/Dockerfile on Ubuntu 16.04 with this Docker version:

Client:
 Version:      1.11.1
 API version:  1.23
 Go version:   go1.5.4
 Git commit:   5604cbe
 Built:        Tue Apr 26 23:43:49 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.11.1
 API version:  1.23
 Go version:   go1.5.4
 Git commit:   5604cbe
 Built:        Tue Apr 26 23:43:49 2016
 OS/Arch:      linux/amd64

But it fails like so:

Dumping under the name emacs
**************************************************
Warning: Your system has a gap between BSS and the
heap (4301663 bytes).  This usually means that exec-shield
or something similar is in effect.  The dump may
fail because of this.  See the section about
exec-shield in etc/PROBLEMS for more information.
**************************************************
/bin/bash: line 7:  7052 Segmentation fault      (core dumped) ./temacs --batch --load loadup bootstrap
make[1]: *** [bootstrap-emacs] Error 1
make[1]: Leaving directory `/root/emacs-24.5/src'
make: *** [src] Error 2

I discovered two workarounds:

  1. Don't build with a Dockerfile; build instead in a running container whose seccomp profile allows the personality syscall.
  2. Disable ASLR on the host (write 0 to /proc/sys/kernel/randomize_va_space) before building.
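
The first workaround could look roughly like the sketch below. It is only an illustration, assuming the seccomp filter is what blocks the personality call: it uses --security-opt seccomp=unconfined (the bluntest way to allow the syscall) rather than a tailored profile, and the container name, the /build.sh mount point and the helper names run and build_in_container are all made up for the example.

```shell
# run CMD...: execute CMD, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# build_in_container BASE_IMAGE SCRIPT FINAL_IMAGE
# Hypothetical helper: runs SCRIPT (absolute host path) in a container without
# the default seccomp filter, then commits the result as FINAL_IMAGE.
build_in_container() {
    base=$1 script=$2 final=$3 name=emacs-build
    run docker run --name "$name" --security-opt seccomp=unconfined \
        -v "$script:/build.sh:ro" "$base" sh /build.sh &&
    run docker commit "$name" "$final" &&
    run docker rm "$name"
}

# Usage (image and script names are placeholders):
#   build_in_container ubuntu:16.04 /path/to/build-emacs.sh example/emacs:24.5
```

Setting DRY_RUN=1 prints the three docker commands without running them, which is a cheap way to check what would be executed.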

Related issues:

https://github.com/docker/docker/issues/20550 https://github.com/docker/docker/issues/22296 https://github.com/docker/docker/issues/22304

Questions:

Silex commented 8 years ago

Link to emacs bug report, but unlikely to be fixed in emacs itself: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=23529

thaJeztah commented 8 years ago

This still looks to be an issue on current master:

docker version
Client:
 Version:      1.12.0-dev
 API version:  1.24
 Go version:   go1.5.4
 Git commit:   1691fe6
 Built:        Thu May 19 11:33:17 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.0-dev
 API version:  1.24
 Go version:   go1.5.4
 Git commit:   1691fe6
 Built:        Thu May 19 11:33:17 2016
 OS/Arch:      linux/amd64

Dumping under the name emacs
**************************************************
Warning: Your system has a gap between BSS and the
heap (18613087 bytes).  This usually means that exec-shield
or something similar is in effect.  The dump may
fail because of this.  See the section about
exec-shield in etc/PROBLEMS for more information.
**************************************************
/bin/bash: line 7:  7638 Segmentation fault      (core dumped) ./temacs --batch --load loadup bootstrap
make[1]: *** [bootstrap-emacs] Error 1
make[1]: Leaving directory `/tmp/tmp.e9hsxFnYRl/emacs-24.5/src'
make: *** [src] Error 2
The command '/bin/sh -c TMP_DIR=$(mktemp -d) &&     curl -sSL -o $TMP_DIR/emacs.tar.xz http://ftpmirror.gnu.org/emacs/emacs-$EMACS_VERSION.tar.xz &&     tar -xJ -C $TMP_DIR -f $TMP_DIR/emacs.tar.xz &&     cd $TMP_DIR/emacs-$EMACS_VERSION &&     ./configure &&     make -j 8 install &&     rm -rf $TMP_DIR' returned a non-zero code: 2

ping @justincormack any idea if this is something we can change?

justincormack commented 8 years ago

I will look at which personality flags it uses; we already allow the most common use cases.

justincormack commented 8 years ago

It is using personality(0x40008) which is ADDR_NO_RANDOMIZE | PER_LINUX32 which disables ASLR (and forces 32 bit). I am not sure about allowing this though, it means anyone can just disable ASLR, which significantly reduces security.
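
For readers who want to check that decoding, here is a small sketch in shell arithmetic. The two constants are the values from the Linux <sys/personality.h> header; decode_personality is a hypothetical helper written for this example and only knows about these two flags.

```shell
# Constants from <sys/personality.h> (Linux).
ADDR_NO_RANDOMIZE=0x0040000   # disable address-space layout randomization
PER_LINUX32=0x0008            # 32-bit Linux execution domain

# decode_personality VALUE: print which of the two flags above are set in
# VALUE (hypothetical helper, not part of any tool).
decode_personality() {
    value=$(( $1 ))
    names=""
    if [ $(( $value & $ADDR_NO_RANDOMIZE )) -ne 0 ]; then
        names="ADDR_NO_RANDOMIZE"
    fi
    # The execution domain lives in the low byte of the value.
    if [ $(( $value & 0xff )) -eq $(( $PER_LINUX32 )) ]; then
        names="${names:+$names | }PER_LINUX32"
    fi
    echo "$names"
}

decode_personality 0x40008   # prints: ADDR_NO_RANDOMIZE | PER_LINUX32
```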

Silex commented 8 years ago

Just copying here the reasons why emacs needs to disable ASLR:

Some background: Emacs has an 'undump' function that saves the Emacs state as an executable: when you run the executable, you get an Emacs with the same (or nearly the same) state. This makes Emacs startup considerably faster. Objects in the restored state must be in the same location as when they were saved, so the executable cannot be subject to ASLR.

Building emacs used to work in previous docker versions... I understand that maybe this was a mistake and now the mistake was "fixed". For now it's still possible to build, but only on a system that has ASLR disabled. Maybe the PR that allows running privileged build commands would be a good compromise on this issue.

It'd be nice if it was possible to find a way to fix this in emacs instead, but I guess it'll be too much work. I'm worried that docker hub might refuse to build images automatically when it's updated to docker 1.11.1, but on docker hub ASLR will probably be disabled.

Silex commented 8 years ago

Another interesting bit:

I don't know all the ins and outs of why it is necessary for Emacs to invoke 'personality'. As I understand it, the build procedure should invoke the shell command 'setfattr -n user.pax.flags -v er temacs' immediately after building temacs, and I don't know why this doesn't make the 'personality' call unnecessary. Perhaps you can consult a seccomp expert who can tell you what's going on, as seccomp is not well-documented. If there is some way to disable ASLR without calling 'personality', that should fix your problem.

I wonder if there is a way to tell GCC to disable ASLR for this binary, when compiling? I'm not familiar with ASLR enough.

justincormack commented 8 years ago

@Silex that setfattr -n user.pax.flags would be only for a PaX setup, which is not that common.

For some more comments on the horrible emacs build process see http://www.openwall.com/lists/musl/2015/02/03/1 which documents the issues building it on Musl libc.

As far as I can see the options are:

  1. Do nothing. If you want to build emacs you will have to disable ASLR globally on your machine or for all docker processes, e.g. globally with docker run --privileged --pid=host alpine sh -c "echo 0 > /proc/sys/kernel/randomize_va_space"
  2. Allow disabling ASLR via personality for all processes, which weakens security
  3. Allow disabling ASLR only on the seccomp profile for build, but not run, just to allow emacs to build.
  4. Allow setting security opts at build time so you can change/disable the seccomp filters for build. This will not allow builds on for example Docker Hub unless they expose these flags, which would be fairly unlikely.

Need to think about which option is best.

cc @jfrazelle

royseto commented 8 years ago

This broke my automated build on Docker Hub that had previously been working. My Dockerfile is based on ubuntu:14.04 and I build emacs 24.5 (among lots of other things) in it.

Docker Hub build log: https://hub.docker.com/r/royseto/devbase/builds/bediz7q7jfk2hizkmzmdord/ Dockerfile: https://github.com/royseto/devbase

Building outside Docker Hub, I was able to work around this on the host (Debian Jessie) by doing echo 0 > /proc/sys/kernel/randomize_va_space before calling docker build. But I would really like my automated build in Docker Hub to work too.

Silex commented 8 years ago

Yeah, I agree the main issue here is Docker Hub... of course I guess we could script our way into having the image built on another box and then pushed to Docker Hub but that sounds a bit silly.

I think this would be nicely solved with a RUN_PRIVILEGED Dockerfile directive or whatever was the proposal... but I understand that for Docker Hub the implications are complicated, it'd mean anyone building images on Docker Hub could basically run root commands on the server.

My guess is that this is currently not solvable on the Docker side. On the Emacs side there is a plan to refactor all this, but it won't be before a (long) while.

royseto commented 8 years ago

I worked around this by packaging up emacs 24.5.2 (built for Ubuntu 14.04 on amd64) into an adhoc Debian package and sticking it in S3 here:

https://s3-us-west-1.amazonaws.com/royseto-public/dpkg/emacs24.5.2_24.5.2-1_amd64.deb

Then I wget and dpkg -i that in my Dockerfile and keep going with the rest of my Docker build.

This works for me, but it breaks some of the transparency of the Docker Hub automated build.

fommil commented 8 years ago

I am seeing the same thing, and this is a very recent change on docker hub.

Linking to all my reports of this

fommil commented 8 years ago

FYI we need to have multiple versions of emacs because we are testing against multiple emacs in our CI.

justincormack commented 8 years ago

@fommil I believe Docker Hub recently updated to a more recent version of Docker for builds, and also to a newer base distribution with seccomp enabled, so that is when the change would have appeared.

I really can't see a RUN_PRIVILEGED Dockerfile command being accepted, as you would need to know that this was required before running the whole Dockerfile, as once you have started without privilege you cannot escalate.

I also can't see any service which operates on the internet willingly disabling basic protections like ASLR, so I think it is unlikely that Docker Hub would want to support this as an option.

fommil commented 8 years ago

:disappointed: this means we can't support matrix builds for emacs versions, unless upstream emacs make this step of their compile optional.

justincormack commented 8 years ago

@fommil many people have asked for this misfeature to be optional for a long time, and support for it was nearly removed from glibc at one point, see https://lwn.net/Articles/673724/

cpitclaudel commented 8 years ago

@fommil

> :disappointed: this means we can't support matrix builds for emacs versions, unless upstream emacs make this step of their compile optional.

This step is optional AFAICT: env CANNOT_DUMP=yes ./configure disables dumping. The resulting Emacs can only be used in batch mode, though. Does that not work on docker?
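
As a sketch of how that suggestion might be wrapped for a build script: the helper name configure_without_dump and the source path in the usage comment are assumptions for illustration; only CANNOT_DUMP=yes itself comes from the comment above.

```shell
# configure_without_dump SRCDIR [CONFIGURE_ARGS...]
# Hypothetical helper: runs ./configure in SRCDIR with CANNOT_DUMP=yes set,
# passing any extra arguments through unchanged.
configure_without_dump() {
    srcdir=$1
    shift
    ( cd "$srcdir" && env CANNOT_DUMP=yes ./configure "$@" )
}

# Usage, e.g. from a Dockerfile RUN step (the path is a placeholder):
#   configure_without_dump /tmp/emacs-24.5 && make -C /tmp/emacs-24.5 install
```

The subshell keeps the cd from leaking into the caller, and env limits CANNOT_DUMP to the configure invocation.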

Silex commented 8 years ago

@cpitclaudel: interesting... I'm testing at the moment.

Today, someone at https://debbugs.gnu.org/cgi/bugreport.cgi?bug=23529 said this would become more of a priority, so I guess that fairly soon we'll have a usable emacs that can be built in a container again (emacs in batch mode is not very useful for most people).

cpitclaudel commented 8 years ago

> (emacs in batch mode is not very useful for most people).

More generally, an undumped emacs isn't useful for most people :) For running CI tests, OTOH, an undumped --batch-only Emacs seems reasonable.

Silex commented 8 years ago

Ok, I confirm that using CANNOT_DUMP=yes works, but the resulting emacs is not very useful as said already.

ninrod commented 8 years ago

I'm also trying to build emacs from source inside a docker container and cannot do it anymore.

cpitclaudel commented 8 years ago

@ninrod Did you try the solution that I suggested above?

ninrod commented 8 years ago

@cpitclaudel I did not because I don't want to run emacs in batch mode. I use emacs for coding and it has to be very fast.

Only way I'm seeing to get this working is to forget about my Dockerfile build and start my base container with docker run --privileged sh -c "echo 0 > /proc/sys/kernel/randomize_va_space", issue commands through docker exec and commit the final container with docker commit.

(I just discovered that a Dockerfile build has no access to overwrite /proc/sys/kernel/randomize_va_space)

I've just started to scratch this solution in work but had to go home. I think I'll finish by the end of this week as this is a side project.

justincormack commented 8 years ago

We are working on an easier workaround too, will update here soon.


Silex commented 8 years ago

For information, there's quite a discussion going on at https://debbugs.gnu.org/cgi/bugreport.cgi?bug=23529

The current idea is to simply remove the undump feature in Emacs and replace it with something more straightforward which should not require special privileges.

Silex commented 7 years ago

For information, Travis uses a somewhat "old" docker and is thus able to build emacs. I use it to build Emacs images at https://hub.docker.com/r/silex/emacs

When Travis's docker gets updated to a newer version, it should still be possible to use sudo: true VMs and set /proc/sys/kernel/randomize_va_space to 0 manually before building.

zw963 commented 7 years ago

This still doesn't work for me.

Dumping under the name emacs
**************************************************
Warning: Your system has a gap between BSS and the
heap (25842296 bytes).  This usually means that exec-shield
or something similar is in effect.  The dump may
fail because of this.  See the section about
exec-shield in etc/PROBLEMS for more information.
**************************************************
/bin/bash: line 7: 10013 Segmentation fault      ./temacs --batch --load loadup bootstrap
Makefile:815: recipe for target 'bootstrap-emacs' failed
make[1]: *** [bootstrap-emacs] Error 1
make[1]: Leaving directory '/data/emacs-24.5/src'
Makefile:387: recipe for target 'src' failed
make: *** [src] Error 2

Docker version 1.13.1, build 092cba3; official Debian 8 image.

Silex commented 7 years ago

@zw963: of course, it's still not fixed in Emacs. It's going slowly; follow the ML for more information.

ninrod commented 7 years ago

Gentlemen,

I build emacs consistently on centos7 which comes with ASLR by default. Here's the scoop.

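# Start a long-lived privileged container that shares the host PID namespace.
docker run -it -d \
  --name "$CONTAINER_NAME" \
  --privileged \
  --entrypoint='' \
  --pid=host \
  "$BASE_IMG_NAME" bash -c "while true; do sleep 1; done"
# Turn ASLR off from inside the container.
docker exec "$CONTAINER_NAME" bash -c "echo 0 > /proc/sys/kernel/randomize_va_space"
# Copy the build assets in and run the provisioning script.
docker cp "$(readlink -f assets)" "$CONTAINER_NAME:$DST_PATH"
time docker exec "$CONTAINER_NAME" bash -c "$DST_PATH/path/to/your/script/provision-emacs.sh"
# Stop the container and save it as the final image.
docker stop "$CONTAINER_NAME"
docker commit "$CONTAINER_NAME" "$FINAL_IMAGE_NAME:$VERSION"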

Silex commented 7 years ago

@ninrod: in effect this is the same as disabling ASLR before building... only with more steps. Also it looks like you don't restore ASLR in the end so your machine now has ASLR disabled.

This is much simpler:

echo 0 > /proc/sys/kernel/randomize_va_space
docker build -t emacs .
echo 2 > /proc/sys/kernel/randomize_va_space
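
A slightly more defensive variant of the three commands above, as a sketch: it remembers the previous value instead of assuming it was 2, and puts it back even when the build exits with an error. The function name with_aslr_disabled and the ASLR_FILE override are inventions for this example, and it does not handle interruption by a signal (a trap would be needed for that).

```shell
# Path of the kernel knob; overridable so the function can be tried on a
# scratch file without root.
ASLR_FILE="${ASLR_FILE:-/proc/sys/kernel/randomize_va_space}"

# with_aslr_disabled CMD [ARGS...]: write 0 to $ASLR_FILE, run CMD, then put
# back whatever value was there before, also when CMD exits non-zero.
with_aslr_disabled() {
    previous=$(cat "$ASLR_FILE") || return 1
    echo 0 > "$ASLR_FILE" || return 1
    "$@"
    status=$?
    echo "$previous" > "$ASLR_FILE"
    return "$status"
}

# Usage (as root on the Docker host):
#   with_aslr_disabled docker build -t emacs .
```

The function returns the exit status of the wrapped command, so it can be used directly in CI scripts.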

ninrod commented 7 years ago

@Silex I don't think I follow. I build emacs inside the docker container that I'm spinning up. That container has to be a centos image which has ASLR enabled by default. I thought I was turning off ASLR just for the container, not for my host.

So, if I understand correctly, you are turning off ASLR on the host, building the image, and then turning it back on for the host. Is that right?

So if I turn off ASLR on the host system, does the container also get ASLR turned off? That seems rather strange.

Suppose my host is an alpine system with no ASLR, and then I spin up a centos image, which has ASLR turned on by default. Would your solution also work in this case?

I think I'm confused because I don't understand the interplay between host and container regarding ASLR. Could you explain please?

Silex commented 7 years ago

Well no, you are turning it off for the host:

philippe@pv-desktop:~$ cat /proc/sys/kernel/randomize_va_space
2

philippe@pv-desktop:~$ docker run -it --rm --privileged --pid=host centos bash -c "echo 0 > /proc/sys/kernel/randomize_va_space"

philippe@pv-desktop:~$ cat /proc/sys/kernel/randomize_va_space
0

That's what --pid=host does.

AFAIK docker run -it --rm --privileged --pid=host centos COMMAND is basically a complicated way to do sudo COMMAND :wink:

Anyway, even if it worked, requiring privileged containers is not a workaround. Docker Hub will likely never build with privileged containers.

Also, someone correct me if I'm wrong but ASLR will always be shared between the host and the containers... given the containers share the same kernel.
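
One way to check this on a given machine is to read the setting on the host and again inside a container and compare. The sketch below only interprets the value; the meanings of 0, 1 and 2 are those given in the kernel's sysctl documentation, and aslr_state is a hypothetical helper name.

```shell
# aslr_state [FILE]: describe the value stored in FILE
# (default: the kernel's randomize_va_space knob).
aslr_state() {
    file="${1:-/proc/sys/kernel/randomize_va_space}"
    case "$(cat "$file")" in
        0) echo "0: ASLR disabled" ;;
        1) echo "1: stack, mmap base and vDSO randomized" ;;
        2) echo "2: as 1, plus heap (brk) randomized" ;;
        *) echo "unknown value" ;;
    esac
}

# Assumed usage, comparing the host with a container (needs Docker):
#   aslr_state
#   docker run --rm centos cat /proc/sys/kernel/randomize_va_space
```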

ninrod commented 7 years ago

@Silex, thanks!

I still have to use the pid=host hack though, because I mount the docker client from moby linux (which docker for windows employs as the linux vm for the docker daemon host). I can't just echo 0 > blah because the address is read only. Magically, with this hack:

docker run -it --rm --privileged --pid=host alpine sh -c "echo 0 > /proc/sys/kernel/randomize_va_space"

then it works. go figure.

it's the same, but not really the same in docker for windows.

Silex commented 5 years ago

Emacs 27 will be able to build inside docker (currently it's the master branch).

For previous versions, you can use https://hub.docker.com/r/silex/emacs or disable ASLR via /proc/sys/kernel/randomize_va_space as mentioned above.

thaJeztah commented 5 years ago

Thanks for the update @Silex, that's great news

ninrod commented 5 years ago

awesome news indeed