emacs-eldev / eldev

Elisp development tool
https://emacs-eldev.github.io/eldev/
GNU General Public License v3.0
226 stars 17 forks source link

Add Command for Running Emacs in Docker #53

Closed LaurenceWarne closed 2 years ago

LaurenceWarne commented 2 years ago

Hi! I've been playing around a bit for a way to run my projects with different Emacs versions without having to have multiple Emacs installations and/or something like evm.

I thought it might be useful to take advantage of the existing eldev docker images on https://hub.docker.com/r/silex/emacs to do this. Demo:

https://user-images.githubusercontent.com/17688577/138159777-9cd027a6-5753-4b2d-b3b4-c6f333ed7e8d.mp4

Essentially the command just runs an appropriate docker image and then executes eldev emacs with the rest of the passed args. Would you be interested in this? - No problem if you think this would be more appropriate as a plugin or something.

I've left out automated testing and documentation, happy to add them (to this PR) if you're interested in this change.

doublep commented 2 years ago

In principle this looks interesting. However, I'm unable to even run it as it is:

~/datetime$ eldev emacs-docker 25 --eval "(datetime-locale-database-version)"
Pulling docker image silex/emacs:25-ci-eldev
[...]

Running docker image silex/emacs:25-ci-eldev

Output of the docker run:
Bootstrapping Eldev for Emacs 25.3 from MELPA Stable...

Importing package-keyring.gpg...
Importing package-keyring.gpg...done
Output of the child Emacs:
No protocol specified
Display :0 unavailable, simulating -nw
emacs: standard input is not a tty

Child Emacs exited with error code 1

Docker run exited with error code 1

As someone who has close to zero experience with Docker, I have no idea how to proceed from here.

I also noticed that local dependencies don't work at all, as I had ~/datetime use local dependency in ~/extmap, but Docker run immediately failed with it, since this directory was not available.

Ideally, I think, it would work as eldev emacs-docker VERSION ELDEV-COMMAND, so that instead of always running Emacs, it would let me do anything. E.g. run tests:

$ eldev emacs-docker 25.2 test

would fetch Docker image and run eldev test in it. The current behavior, as I understand it, would then easily be available with one extra word on the command line, i.e. as eldev emacs-docker VERSION emacs ....

Also, would be nice if there was a way to make local dependencies work. If this is not possible, maybe Eldev in Docker could receive a flag "ignore all local dependencies and print a warning if any are used".

LaurenceWarne commented 2 years ago

Thanks for the feedback.

As someone who has close to zero experience with Docker, I have no idea how to proceed from here.

My guess is that you haven't xhost +local:root, from looking at the docker emacs doc I think this is something we shouldn't run automatically since there are security considerations with it (probably just make a note in the doc).

I also noticed that local dependencies don't work at all, as I had ~/datetime use local dependency in ~/extmap, but Docker run immediately failed with it, since this directory was not available.

I think the last commit should cover this, I cloned datetime and extmap and tried it out with an Eldev-local of:

(eldev-use-local-dependency "~/extmap")

Seems to work ok.

Ideally, I think, it would work as eldev emacs-docker VERSION ELDEV-COMMAND, so that instead of always running Emacs, it would let me do anything.

Yeah, that makes sense. Hopefully the last commit covers this too, so the command in the example is:

eldev --trace emacs-docker 25 emacs --foreground-color=red --eval '(insert (format "Emacs version: %s\nEldev commands: %s" emacs-version (mapconcat (lambda (c) (symbol-name (cdr c))) eldev--commands ", ")))'

As an aside, is it possible to run eldev emacs -nw? I get emacs: standard input is not a tty.

doublep commented 2 years ago

From this point on it looks generally useful, so I agree to merge this into Eldev. If you wanted to just present a showcase, I can continue from here. Or if you feel like it, you can continue improving this PR yourself.

Things to improve and investigate:

As an aside, is it possible to run eldev emacs -nw? I get emacs: standard input is not a tty.

Apparently not. I don't see a way in Emacs to forward stdin to the launched process. In general, Emacs/Elisp is not a nice combination for that. E.g. I gave up on issue #25 because of related and apparently unsolvable problems.

LaurenceWarne commented 2 years ago

Nice, I've pushed a few commits adding some documentation and a couple of tests (the tests take ~20 secs each due to the docker invocations so I've held off adding more for now - any thoughts on this?) and also addressing some of the ^.

I've made the docker command customizable along with allowing a custom function to be set to determine the docker image to use from an Emacs version. Some thoughts on some of the other points:

Use Elisp's filepath handling functions instead of regexps for local dependency Docker mounts. At the very least, from Docker's documentation it appears to support Windows, and even if we don't have appropriate images now, why not have Eldev itself at least prepared for that?

Yeah I found obtaining the file paths for the mounts to be a bit tricky given you can use relative paths in Eldev.local, something like (format "%s:/root/%s" (expand-file-name local-dir) (file-relative (expand-file-name local-dir) (getenv "HOME"))) works for paths given local to a user's home but breaks for absolute paths. I think this can probably done by handling the cases differently if local-dir starts with "HOME" - I see (getenv "HOME") apparently works on windows so hopefully that would solve compatabililty stuff.

What is /tmp/.X11-unix for? Can we find the path from somewhere instead of hardcoding it? Maybe Docker has some option for that?

See https://unix.stackexchange.com/questions/196677/what-is-tmp-x11-unix. It appears to be fairly standard for X, but yeah would break with Windows.

I'm happy to continue working on some of the above. I've un-drafted this PR feel free to merge it if you'd prefer they'd be done in seperate PRs (and you're happy with the new changes + doc!) - I think this would make the most sense if you wanted to investigate some of the above in tandem. Continuing on this PR is fine by me otherwise :slightly_smiling_face: .

doublep commented 2 years ago

Have no time to review and investigate everything yet, sorry. Will reply again later, maybe tomorrow.

the tests take ~20 secs each due to the docker invocations so I've held off adding more for now - any thoughts on this?)

Move them to integration tests (we integrate with remote-provided images here). Or maybe even to their own test type, due to slowness.

I'm happy to continue working on some of the above. I've un-drafted this PR feel free to merge it if you'd prefer they'd be done in seperate PRs (and you're happy with the new changes + doc!) - I think this would make the most sense if you wanted to investigate some of the above in tandem. Continuing on this PR is fine by me otherwise slightly_smiling_face .

Thank you. Please continue on your own for now, I will tell you when I want to merge. I will request you to squash all the commits then. Also, please rebase on master, since otherwise the tests will fail anyway (master now disabled testing on Windows as always failing due to external bug).

LaurenceWarne commented 2 years ago

Have no time to review and investigate everything yet, sorry. Will reply again later, maybe tomorrow.

No problem, whenever you have the time - no rush :slightly_smiling_face:

Thank you. Please continue on your own for now, I will tell you when I want to merge. I will request you to squash all the commits then. Also, please rebase on master, since otherwise the tests will fail anyway (master now disabled testing on Windows as always failing due to external bug).

Will do.

LaurenceWarne commented 2 years ago

What's currently TODO:

QQ: Is there a standard way to install Eldev from my local git repo? - atm I'm just copying files to the relevant place in ~/.eldev/ :smile:.

doublep commented 2 years ago
  • Maybe switch to docker cp + docker exec instead of one docker run so we don't use volumes, this would stop the container making files owned by root in the caches

Yeah, I already ran into this too. However, could there be a better solution? As I understand, if we do it with cp, nothing will be cached between different Docker runs. In other words, can we let the process in Docker have write access to the caches, but use "normal" user as file creator/writer?

doublep commented 2 years ago

QQ: Is there a standard way to install Eldev from my local git repo? - atm I'm just copying files to the relevant place in ~/.eldev/

Not sure if it would be enough, but have a look at eldev--upgrade-self-from-forced-pa.

doublep commented 2 years ago

Some more thoughts:

LaurenceWarne commented 2 years ago

Yeah, I already ran into this too. However, could there be a better solution? As I understand, if we do it with cp, nothing will be cached between different Docker runs. In other words, can we let the process in Docker have write access to the caches, but use "normal" user as file creator/writer?

I created antoher image: https://github.com/LaurenceWarne/eldev/commit/aadfcd7f3c9b075e6e965821e77130c81ded39dd, it works by creating another user with the same uid on the container as the host, and running with that user on the container (inspired by http://www.inanzzz.com/index.php/post/dna6/unning-docker-container-with-a-non-root-user-and-fixing-shared-volume-permissions-with-gosu and https://github.com/daniccan/docker-gosu).

It lgtm testing, in particular it solves the permissions errors on the test runs. lmk what you think of this - I'm not sure where it would live (I don't think it should be part of the default ci-eldev img since it does a lot of stuff CI users probably shouldn't care about).

doublep commented 2 years ago

I created antoher image: LaurenceWarne@aadfcd7, it works by creating another user with the same uid on the container as the host, and running with that user on the container

Uh, is there no easier way? I also searched for this on the Internet briefly, and haven't found anything simple. But is it really not possible to go without a specialized image? This would effectively remove the option to use user's own image (i.e. what we discussed before, either VER -> silex's image, or IMAGE-NAME, for a custom one), or at least require that they base on "our" image somehow...

It lgtm testing, in particular it solves the permissions errors on the test runs. lmk what you think of this - I'm not sure where it would live (I don't think it should be part of the default ci-eldev img since it does a lot of stuff CI users probably shouldn't care about).

Yeah, I thought about negotiating with silex to include this into his images, but now I see that there are quite a few things going in it.

LaurenceWarne commented 2 years ago

Uh, is there no easier way? I also searched for this on the Internet briefly, and haven't found anything simple. But is it really not possible to go without a specialized image? This would effectively remove the option to use user's own image (i.e. what we discussed before, either VER -> silex's image, or IMAGE-NAME, for a custom one), or at least require that they base on "our" image somehow...

I've just thought of a horrific hack to get away with using the same image (or any image with an Emacs installation): the problem with using --user UID:GID on docker run is that this 'user' on the container won't have permission to run eldev (or really do anything outside of the mounted volumes), but if we pass a HOME variable to some directory on a docker volume, then we can install eldev when docker run is called, example (setting the HOME to inside the project directory):

 docker run -e HOME=/eldev/fake-home -it --rm -u 1000:1000 -v /home/laurencewarne/projects/eldev/:/eldev -w /eldev silex/emacs:27-ci-eldev sh -c 'curl -fsSL https://raw.github.com/doublep/eldev/master/webinstall/eldev | sh && export PATH="/eldev/fake-home/.eldev/bin:$PATH" && eldev test'

This appear s to work, though seems a bit fragile, and we'd have to mount the global cache and config inside this new home, etc. I don't know if I'd call this an "easier way", but wdyt?

It would be interesting to test something like eldev docker 26 docker 25 test, i.e. a Docker within a Docker. Largely for fun, but also to potentially spot some otherwise hard-to-discover errors. Since this is going to be slow, really limit this to one test.

We'd have to install docker on the images also :smile:

doublep commented 2 years ago

I've just thought of a horrific hack

If you have looked at Eldev code enough, you must know that I'm not against hacks. The source code is full of hacks and workarounds. Whatever works, unless it really works "by side effect only" or is likely to break with any small future change in the program we are integrating with here (i.e. Docker).

[...] This appear s to work, though seems a bit fragile, and we'd have to mount the global cache and config inside this new home, etc. I don't know if I'd call this an "easier way", but wdyt?

As it is looks fine to me. I'm only not sure I understand the meaning of "mount the global cache and config inside this new home, etc.".

This fake root should be placed somewhere inside (eldev-cache-dir t t). The binary we can just copy from (locate-file (format "bin/eldev" script) load-path), see eldev-upgrade-self.

I don't know if I'd call this an "easier way"

It is easier if we don't have to create and maintain (remember about rebuilding when a new Emacs version comes) extra images. And even better if it allows users to use their own images. I don't care if it results in 10, 30 or 100 more lines of Elisp code.

It would be interesting to test something like eldev docker 26 docker 25 test, i.e. a Docker within a Docker. Largely for fun, but also to potentially spot some otherwise hard-to-discover errors. Since this is going to be slow, really limit this to one test.

We'd have to install docker on the images also

Crap, didn't think about that. Oh well ;)

LaurenceWarne commented 2 years ago

Cool, it looks to have fixed the permissions on CI (lmk if you encounter any more).

As it is looks fine to me. I'm only not sure I understand the meaning of "mount the global cache and config inside this new home, etc.".

Just that we have to change the target mount points on the container, e.g. /root/.eldev/config -> /project-dir/.eldev/docker-home/.eldev/config.

As suggested, the "fake home" on the host is created in eldev-cache-dir/docker-home. This directory needs to be created on the host prior to any eldev docker call else when docker creates it on the container it is owned by root (see eldev--docker-create-directories). It's currently persisted between runs (so you would be able to see it in .eldev). Do you think this should be deleted post any eldev docker run, or do you think something like changing the dir to eldev-cache-dir/.docker-home would be sufficient?

Edit: Also we don't use eldev on the eldev-ci images so we can use plain old Emacs images.

TODO

doublep commented 2 years ago

Do you think this should be deleted post any eldev docker run, or do you think something like changing the dir to eldev-cache-dir/.docker-home would be sufficient?

It can stay if it contains anything useful that could speed up future Docker runs. Otherwise delete it after all runs. However, future runs must not fail if the directory didn't get deleted for whatever reason. Also, no need to rename it with a dot, because cache directory is already hidden (it is that .eldev you have in projects).

Also we don't use eldev on the eldev-ci images so we can use plain old Emacs images.

Agreed. Also wanted to mention, but I'm too late.

Don't install eldev from melpa for the tests, use the system eldev

This is very important. Basically, when testing, I want to use Eldev that I'm developing right now for all tests. If I break eldev-exec in any way, I want both eldev exec and eldev docker ... exec blow up immediately.

General windows compat

This we can forget because of the hack. I guess it would be even better to add an explicit test for system-type in the new command and outright fail on systems that don't have user IDs, or at least not in the way we need/expect. I.e. let's only make it work where we are sure it will work. Others can try and make it support Windows later.

doublep commented 2 years ago

Looks like CI is broken again, this time on macOS. Let's wait for a while.

LaurenceWarne commented 2 years ago

It can stay if it contains anything useful that could speed up future Docker runs. Otherwise delete it after all runs. However, future runs must not fail if the directory didn't get deleted for whatever reason. Also, no need to rename it with a dot, because cache directory is already hidden (it is that .eldev you have in projects).

I've set it up so it's deleted after runs. But there should be no problem if it happens to exist prior to a run.

This is very important. Basically, when testing, I want to use Eldev that I'm developing right now for all tests. If I break eldev-exec in any way, I want both eldev exec and eldev docker ... exec blow up immediately.

Cool the last few commits should handle this.

This we can forget because of the hack. I guess it would be even better to add an explicit test for system-type in the new command and outright fail on systems that don't have user IDs, or at least not in the way we need/expect. I.e. let's only make it work where we are sure it will work. Others can try and make it support Windows later.

If you're ok with that, I've disabled the tests for windows. Though I've just noticed that on GH actions, mac os appears not to have docker installed :disappointed:

doublep commented 2 years ago

If you're ok with that, I've disabled the tests for windows. Though I've just noticed that on GH actions, mac os appears not to have docker installed

Yes, fine. Also fine to skip tests when Docker is not found, and as I understand it, your code does just that already. Please rebase on master, I skip tests on macOS now, because setup-emacs action no longer works there (already reported).

doublep commented 2 years ago

Please address the latest review (four points) and we can finally merge this. Before merging, please squash everything into one commit. Or, if you want, break out stuff that is not related to Docker per se, e.g. there could be a separate commit that adds :nolf in eldev--forward-process-output and adjusts the tests "broken" because of that.

Also, please remove changes to README.adoc. I commit documentation changes to a separate branch, so that what can be seen on the project page always reflects current stable release, and not ongoing development. If you want, file a separate PR against future-doc branch, or else I will just add the documentation manually later.

doublep commented 2 years ago

Ah, and also please rebase on master, so that I can run tests. Otherwise they would fail because macOS testing is broken for unrelated reasons (in master I already disabled it).

LaurenceWarne commented 2 years ago

Please address the latest review (four points) and we can finally merge this. Before merging, please squash everything into one commit. Or, if you want, break out stuff that is not related to Docker per se, e.g. there could be a separate commit that adds :nolf in eldev--forward-process-output and adjusts the tests "broken" because of that.

Done! I've been lazy and stuck everything all in one :slightly_smiling_face: Thanks for your patience.

Also, please remove changes to README.adoc. I commit documentation changes to a separate branch, so that what can be seen on the project page always reflects current stable release, and not ongoing development. If you want, file a separate PR against future-doc branch, or else I will just add the documentation manually later.

No problem, I can PR with the doc changes I have locally if you like.

Edit: also on the mac os stuff, I think (unable to test) that the commands might work if Emacs isn't launched as a GUI, but mac os doesn't use X (?) so I'm guessing the GUI stuff won't work.

doublep commented 2 years ago

Hm, the automated tests on GitHub servers pass, but I have problems running anything locally. Neither from Eldev project itself:

~/eldev$ eldev -dt docker 27 eval 1
Started up on Sat Nov 6 19:05:43 2021
Running on GNU Emacs 27.1 (build 1, x86_64-pc-linux-gnu, GTK+ Version 2.24.32)
 of 2020-10-27
Project directory: ‘/home/paul/eldev/’
Loading file ‘/home/paul/.eldev/config’...
Loading file ‘Eldev’...
Loading file ‘Eldev-local’...
Executing command ‘docker’...
Executing ‘eldev-executing-command-hook’...
Running command ’/usr/bin/docker run --rm -e HOME=/eldev/.eldev/docker-home -u 1000:1000 -v /home/paul/eldev/:/eldev -w /eldev -v ~/eldev:/eldev -v /home/paul/.eldev/config:/eldev/.eldev/docker-home/.eldev/config -v /home/paul/.eldev/global-cache:/eldev/.eldev/docker-home/.eldev/global-cache silex/emacs:27 sh -c ELDEV_LOCAL=/eldev /eldev/bin/eldev eval 1’

Output of the /usr/bin/docker run:
docker: Error response from daemon: Duplicate mount point: /eldev.
See 'docker run --help'.

/usr/bin/docker run exited with error code 125
Finished erroneously on Sat Nov 6 19:05:43 2021

nor from another, with a different error:

~/iter2$ eldev -dt docker 27 eval 1
Started up on Sat Nov 6 19:05:38 2021
Running on GNU Emacs 27.1 (build 1, x86_64-pc-linux-gnu, GTK+ Version 2.24.32)
 of 2020-10-27
Project directory: ‘/home/paul/iter2/’
Loading file ‘/home/paul/.eldev/config’...
Loading file ‘Eldev’...
Loading file ‘Eldev-local’...
Reading internal evaluation cache from file ‘.eldev/27.1/internal-eval.cache’...
Using cached value for form ‘eldev-project-main-file’ in directory ‘/home/paul/git/undercover/’: nil
Will use directory ‘~/git/undercover/’ as local dependency ‘undercover’ with loading mode ‘as-is’
Executing command ‘docker’...
Running command ’/usr/bin/docker run --rm -e HOME=/iter2/.eldev/docker-home -u 1000:1000 -v /home/paul/iter2/:/iter2 -w /iter2 -v ~/eldev:/eldev -v /home/paul/.eldev/config:/iter2/.eldev/docker-home/.eldev/config -v /home/paul/.eldev/global-cache:/iter2/.eldev/docker-home/.eldev/global-cache -v /home/paul/git/undercover/:/iter2/.eldev/docker-home/git/undercover/ silex/emacs:27 sh -c ELDEV_LOCAL=/eldev /eldev/bin/eldev eval 1’

Output of the /usr/bin/docker run:
docker: Error response from daemon: create ~/eldev: "~/eldev" includes invalid characters for a local volume name, only "[a-zA-Z0-9][a-zA-Z0-9_.-]" are allowed. If you intended to pass a host directory, use absolute path.
See 'docker run --help'.

/usr/bin/docker run exited with error code 125
Finished erroneously on Sat Nov 6 19:05:38 2021

Am I doing something wrong?

doublep commented 2 years ago

Ah, apparently it's caused by ~ in $ELDEV_LOCAL (here it is ~/eldev). Please fix this too before I merge, because you know the code better.

Also, after working around this I still get yet another error:

$ ELDEV_LOCAL=/home/paul/eldev eldev -dt docker 27 eval 1
Started up on Sat Nov 6 19:13:10 2021
Running on GNU Emacs 27.1 (build 1, x86_64-pc-linux-gnu, GTK+ Version 2.24.32)
 of 2020-10-27
Project directory: ‘/home/paul/eldev/’
Loading file ‘/home/paul/.eldev/config’...
Loading file ‘Eldev’...
Loading file ‘Eldev-local’...
Executing command ‘docker’...
Executing ‘eldev-executing-command-hook’...
Running command ’/usr/bin/docker run --rm -e HOME=/eldev/.eldev/docker-home -u 1000:1000 -v /home/paul/eldev/:/eldev -w /eldev -v /home/paul/.eldev/config:/eldev/.eldev/docker-home/.eldev/config -v /home/paul/.eldev/global-cache:/eldev/.eldev/docker-home/.eldev/global-cache silex/emacs:27 sh -c ELDEV_LOCAL=/eldev /eldev/bin/eldev eval 1’

Output of the /usr/bin/docker run:
docker: Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: rootfs_linux.go:76: mounting "/home/paul/.eldev/config" to rootfs at "/eldev/.eldev/docker-home/.eldev/config" caused: mount through procfd: not a directory: unknown: Are you trying to mount a directory onto a file (or vice-versa)? Check if the specified host path exists and is the expected type.
time="2021-11-06T19:13:10+01:00" level=error msg="error waiting for container: context canceled"

/usr/bin/docker run exited with error code 125
Finished erroneously on Sat Nov 6 19:13:10 2021
LaurenceWarne commented 2 years ago

Ah, apparently it's caused by ~ in $ELDEV_LOCAL (here it is ~/eldev). Please fix this too before I merge, because you know the code better.

Weird, I was using ~ also, maybe my terminal emulator was doing some kind of auto expansion, should be fixed now anyhow.

Also, after working around this I still get yet another error:

For some reason I got it into my head that ~/.eldev/config was a directory :facepalm:, so I think you having an existing file at that location was what was giving Are you trying to mount a directory onto a file (or vice-versa)?. This should be fixed now also, sorry!

doublep commented 2 years ago

Thank you, now it works, so merged! I massaged it a bit before committing, mostly the docstring of the new command.

I will release 0.10 with this new command and some testing support improvements (most important are already committed) next week.

Thanks for your patience.

Rather thank you. All the time I was afraid you'd say: "Fuck it with all the requests, I'm done here" ;). Then I'm not sure if I would come up with the clever hack for user mismatch and "file created by root" problem caused by it.

LaurenceWarne commented 2 years ago

Awesome! Thanks again for your time and quick responses.

"Fuck it with all the requests, I'm done here" ;).

This made me laugh :smile: