sile-typesetter / casile

The CaSILE toolkit, a book publishing workflow employing SILE and other wizardry.

Having trouble running casile from docker #152

Closed arunvickram closed 1 year ago

arunvickram commented 1 year ago

Hi @alerque,

I tried pulling the latest casile docker image, and this is the kind of error I'm getting. I'm running it in casile-demos.

I did alias casile='docker run -it --volume "$(pwd):/data" --user "$(id -u):$(id -g)" ghcr.io/sile-typesetter/casile:latest' and then casile setup, and this is what the result was:

┏━ Welcome to CaSILE v0.10.12!
┣━ Configuring repository for use with CaSILE
┗━ CaSILE run complete
Error: Custom { kind: InvalidInput, error: "setup-error-not-git" }

alerque commented 1 year ago

Are you running this on a Git clone of casile demos or just a copy of the directory (e.g. extracted from a ZIP file)?

I've spent a couple hours fighting with CaSILE trying to work around new Git restrictions that are complicating Docker paths, but this error doesn't have anything to do with that. It just looks like you are not actually in a real Git directory.

$ pwd
/home/caleb/projects/sile-typesetter/casile-demos

$ alias casile='docker run -it --volume "$(pwd):/data" --user "$(id -u):$(id -g)" ghcr.io/sile-typesetter/casile:latest'

$ casile setup
┏━ Welcome to CaSILE ⁨v0.10.12⁩!
┣━ Configuring repository for use with CaSILE
┠─ Everything seems to be ship shape, warm up the presses!
┣━ Building target(s) using ‘make’
┠┄ Existing .gitignore file is up to date
┠┄ Setting default length of short SHA hashes in repository
┠┄ Reseting version tracked file timestamps to last affecting commit
┗━ CaSILE run complete

$ git describe --always HEAD
c345a91
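
For readers following along, the "is this a real Git directory" question can be checked directly on the host before Docker gets involved. This is a minimal sketch, not part of CaSILE: the helper name `is_git_checkout` is hypothetical, and it assumes `git` is installed on the host. It only uses `git rev-parse --is-inside-work-tree`, which prints `true` inside a work tree.

```shell
#!/bin/sh
# Hypothetical helper (not part of CaSILE): report whether a directory is
# inside a Git work tree, as opposed to a plain copy or ZIP extract.
is_git_checkout() {
    dir=${1:-.}
    # rev-parse prints "true" and exits 0 only inside a work tree. If Git
    # refuses to read the repository at all, this also reports failure.
    [ "$(git -C "$dir" rev-parse --is-inside-work-tree 2>/dev/null)" = "true" ]
}

if is_git_checkout "$(pwd)"; then
    echo "Git work tree: fine to mount as /data"
else
    echo "Not a Git work tree (ZIP extract or plain copy?)"
fi
```

Running the same check inside the container, as in the next comments, is what separates "not a clone" from "clone not visible or not trusted in the container".
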
arunvickram commented 1 year ago

I git cloned casile-demos.

$ git describe --always HEAD
ef691e9
alerque commented 1 year ago

That's fine, it happens to be a debug commit I pushed a little while back but no functional changes between that and the current debug one.

Try this from the same casile demos checkout:

$ docker run -it --volume "$(pwd):/data" --user "$(id -u):$(id -g)" --entrypoint zsh ghcr.io/sile-typesetter/casile:latest -c 'git describe --always HEAD'

You should get the same answer, but I'm trying to find where the breakdown here is. It seems like your project isn't getting mounted inside the docker container.

arunvickram commented 1 year ago

When running that command, I'm getting this error:

fatal: detected dubious ownership in repository at '/data'
To add an exception for this directory, call:

    git config --global --add safe.directory /data
alerque commented 1 year ago

What shell are you using? That error indicates that the user IDs are not being mapped correctly, and the first error you opened this issue with suggests that maybe the directory isn't being mounted. Both of those might mean that $(pwd), $(id -u), and similar command substitutions are not working in your shell.
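
One way to rule out broken command substitutions is to validate the expanded values before handing them to `docker run`. The sketch below is illustrative only: the function names are hypothetical, and it just checks that the volume path is absolute and that the UID and GID are plain numbers, then prints the flags it would use.

```shell
#!/bin/sh
# Hypothetical sanity checks, not part of CaSILE or Docker.
is_abs_path() { case $1 in /*) return 0 ;; *) return 1 ;; esac; }
is_numeric_id() { case $1 in ''|*[!0-9]*) return 1 ;; *) return 0 ;; esac; }

# $1 = project path, $2 = uid, $3 = gid
check_docker_args() {
    is_abs_path "$1" || { echo "volume path is not absolute: '$1'"; return 1; }
    is_numeric_id "$2" || { echo "uid is not numeric: '$2'"; return 1; }
    is_numeric_id "$3" || { echo "gid is not numeric: '$3'"; return 1; }
    # Everything expanded sensibly; show the flags that would be passed.
    echo "--volume $1:/data --user $2:$3"
}

check_docker_args "$(pwd)" "$(id -u)" "$(id -g)"
```

If the substitutions were passed through literally (for example as the text `$(id -u)`), the numeric check fails and the message shows the unexpanded value.
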

arunvickram commented 1 year ago

I'm currently using zsh.

alerque commented 1 year ago
$ echo $SHELL
$ echo $(pwd) $(id -u)
alerque commented 1 year ago

ZSH shouldn't be a problem, that's what I use by default so it's the least likely to have issues with my commands!

arunvickram commented 1 year ago

I'm getting this result:

$ echo $(pwd) $(id -u)
/Users/arun/Documents/casile-demos 501
alerque commented 1 year ago

What OS? And is your docker service running as root or some limited privileged user?

arunvickram commented 1 year ago

I'm running on macOS. I'm currently using the Docker Desktop app, so I'm not 100% sure whether it's running as root or not.

alerque commented 1 year ago

Got it. Let me try to fix the other known issue, then I'll look into this again. The GitHub Action for this project stopped working because of security changes in Docker & Git that trickled down and landed in the releases I made today, which is a serious regression. Once that's going I'll try to figure out what issue you are hitting. They might even be related, cf. https://github.com/actions/runner/issues/2033

alerque commented 1 year ago

I'm hoping the GitHub Action related issues are fixed in v0.10.13, but there are a few parts of the testing that only really work on an actual tagged release so I had to tag it to find out. While it builds I'm going to bed. I'll check in tomorrow and see if my book builds will run again. The new Git security restrictions and GH Actions Docker shenanigans are actually pretty obnoxious.

I'm hoping your issue might be fixed as a side effect of other re-arrangements, but I'm not sure. The macOS desktop app may or may not have some of the same changes. Give it a shot with v0.10.13 when the Docker images are up.

If not I have some more ideas about what to run to test it now ;-)

arunvickram commented 1 year ago

I'm still getting that "not a git repository" error, unfortunately.

alerque commented 1 year ago

Let's have a look at what things look like using a shell inside the Docker container. Note that with v0.10.13 and later you shouldn't normally need the --user mapping stuff any more, but you do have to be in a Git repository and pass it as a volume into the container.

Here's what it looks like for me when I cd into a clone of casile-demos:

$ pwd
/home/caleb/projects/sile-typesetter/casile-demos
$ docker run -itv "$(pwd):/data" --entrypoint zsh ghcr.io/sile-typesetter/casile:v0.10.13 -c "pwd;ls -al;git describe --always;;casile-entry.zsh status"
/data
total 212
drwxr-xr-x 1 1000 1000    450 Feb  1 14:55 .
drwxr-xr-x 1 root root    150 Feb  1 15:10 ..
drwxr-xr-x 1 1000 1000    124 Feb  1 10:35 .casile
drwxr-xr-x 1 1000 1000    266 Feb  1 14:55 .git
-rw-r--r-- 1 1000 1000     63 Apr  8  2022 .gitattributes
drwxr-xr-x 1 1000 1000     18 Jan  2  2021 .github
-rw-r--r-- 1 1000 1000   1573 Jan 19  2021 .gitignore
-rw-r--r-- 1 1000 1000      0 Jan 19  2021 .gitmodules
-rw-r--r-- 1 1000 1000   1210 Jan 21  2017 LICENSE
-rw-r--r-- 1 1000 1000   2278 Jan 20  2021 README.md
-rw-r--r-- 1 1000 1000    349 Apr  8  2022 casile.mk
-rw-r--r-- 1 1000 1000 132867 Feb  1 10:40 docker
drwxr-xr-x 1 1000 1000    124 Jan 31 12:20 fancy_book-chapters
-rw-r--r-- 1 1000 1000    426 Feb  1 10:02 fancy_book-manifest.yml
-rw-r--r-- 1 1000 1000    919 Jan 31 12:28 fancy_book.lua
-rw-r--r-- 1 1000 1000    136 Feb  4  2020 fancy_book.md
-rw-r--r-- 1 1000 1000    283 Feb  4  2020 fancy_book.yml
-rw-r--r-- 1 1000 1000  30211 Feb  1 10:39 local
-rw-r--r-- 1 1000 1000    196 Feb  1 10:02 simple_book-manifest.yml
-rw-r--r-- 1 1000 1000     43 Feb  4  2020 simple_book.md
-rw-r--r-- 1 1000 1000     34 Feb  4  2020 simple_book.yml
7b827c6
error: could not lock config file /etc/gitconfig: Permission denied
┏━ Welcome to CaSILE ⁨v0.10.13⁩!
┣━ Scanning project status
┠─ Is the path a Git repository? Yes
┠─ Is the system’s ‘make’ executable? Yes
┠─ Are we not in the CaSILE source repository? Yes
┠─ Can we write to the project base directory? Yes
┠─ Is the system’s ‘make’ a supported version of GNU Make? Yes
┠─ Everything seems to be ship shape, warm up the presses!
┗━ CaSILE run complete

Note that the config lock error is innocuous and will be fixed in the next release.

At least for me this shows that the volume passed on the command line got mounted at /data and everything is there as expected. I guess we're looking for what happens to your project files inside the container that's different from mine.

arunvickram commented 1 year ago

It seems to work for me when I do this:


$ docker run -itv "$(pwd):/data" --entrypoint zsh ghcr.io/sile-typesetter/casile:v0.10.13 -c "pwd;ls -al;git describe --always;;casile-entry.zsh status"
/data
total 44
drwxr-xr-x 16 root root  512 Jan 31 16:35 .
drwxr-xr-x  1 root root 4096 Feb  1 15:27 ..
drwxr-xr-x 12 root root  384 Jan 31 16:35 .git
-rw-r--r--  1 root root   63 Jan 31 16:35 .gitattributes
drwxr-xr-x  3 root root   96 Jan 31 16:35 .github
-rw-r--r--  1 root root 1573 Jan 31 16:35 .gitignore
-rw-r--r--  1 root root    0 Jan 31 16:35 .gitmodules
-rw-r--r--  1 root root 1210 Jan 31 16:35 LICENSE
-rw-r--r--  1 root root 2278 Jan 31 16:35 README.md
-rw-r--r--  1 root root  349 Jan 31 16:35 casile.mk
drwxr-xr-x  6 root root  192 Jan 31 16:35 fancy_book-chapters
-rw-r--r--  1 root root  919 Jan 31 16:35 fancy_book.lua
-rw-r--r--  1 root root  136 Jan 31 16:35 fancy_book.md
-rw-r--r--  1 root root  283 Jan 31 16:35 fancy_book.yml
-rw-r--r--  1 root root   43 Jan 31 16:35 simple_book.md
-rw-r--r--  1 root root   34 Jan 31 16:35 simple_book.yml
ef691e9
┏━ Welcome to CaSILE v0.10.13!
┣━ Scanning project status
┠─ Is the path a Git repository? Yes
┠─ Is the system’s ‘make’ executable? Yes
┠─ Is the system’s ‘make’ a supported version of GNU Make? Yes
┠─ Are we not in the CaSILE source repository? Yes
┠─ Can we write to the project base directory? Yes
┠─ Everything seems to be ship shape, warm up the presses!
┗━ CaSILE run complete
alerque commented 1 year ago

Okay how about back to using the default entry point:

$ docker run -itv "$(pwd):/data" ghcr.io/sile-typesetter/casile:v0.10.13 status

I don't understand why inspecting it in a shell would have changed anything.

arunvickram commented 1 year ago

Looks like it works 🤷

❯ docker run -itv "$(pwd):/data" ghcr.io/sile-typesetter/casile:v0.10.13 status
┏━ Welcome to CaSILE v0.10.13!
┣━ Scanning project status
┠─ Is the path a Git repository? Yes
┠─ Is the system’s ‘make’ executable? Yes
┠─ Is the system’s ‘make’ a supported version of GNU Make? Yes
┠─ Are we not in the CaSILE source repository? Yes
┠─ Can we write to the project base directory? Yes
┠─ Everything seems to be ship shape, warm up the presses!
┗━ CaSILE run complete

Edit: This may be the offending factor: --user "$(id -u):$(id -g)"

alerque commented 1 year ago

The user ID mapping --user "$(id -u):$(id -g)" business was necessary before v0.10.13. I had to create a workaround that detects the owner of the files and changes to that UID because GitHub Actions stopped giving us a way to work around their Docker containers always running as root. The --user flag can still be used and should not cause an error as long as the UID and GID you are passing in match the ownership of the files you are passing in.

Is it possible that your original problem above is that you Git cloned the repo as root or some other user but were trying to run CaSILE as a different user? I.e. is the 501 that showed up above as the output of id -u the owner of the files you are trying to work with? I suspect it is not and you somehow cloned the repository as root but were trying to run CaSILE as your user.

If that turns out to be the problem the solution for you would be to clone and run your tools as the same user, and the outcome of this issue should be better detection and errors on mismatched privileges so future users that end up in that position aren't so confused.
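
To make the "UID and GID must match the file ownership" condition concrete, here is a small sketch that derives the `--user` value from the owner of the project directory rather than from the current user, and prints the resulting command instead of running it. The function names are hypothetical, the image tag is simply the one used earlier in this thread, and the printed command is not quoted for paths containing spaces.

```shell
#!/bin/sh
# Print "UID:GID" of a path's owner. `ls -n` shows numeric IDs on both
# Linux and macOS, so no platform-specific `stat` flags are needed.
owner_ids() {
    ls -nd "$1" | awk '{ print $3 ":" $4 }'
}

# Print (do not run) a docker command whose --user matches the project owner.
# Hypothetical dry-run helper for illustration only.
casile_dry_run() {
    project=$1
    shift
    echo docker run -it --volume "$project:/data" \
        --user "$(owner_ids "$project")" \
        ghcr.io/sile-typesetter/casile:latest "$@"
}

casile_dry_run "$(pwd)" status
```

If the printed `--user` value differs from `$(id -u):$(id -g)`, the clone is owned by someone other than the user running the command, which is the mismatch described above.
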

arunvickram commented 1 year ago

That wasn't it. I cloned and ran the command using the user arun. But at least now when I run casile setup it works, although running casile make freezes the whole thing

❯ docker run -itv "$(pwd):/data" ghcr.io/sile-typesetter/casile:v0.10.13 make
┏━ Welcome to CaSILE v0.10.13!
┠─ Everything seems to be ship shape, warm up the presses!
┣━ Building target(s) using ‘make’
^C┗━ CaSILE run complete
Error: Custom { kind: InvalidInput, error: "Failed to execute a subprocess for ‘make’." }
alerque commented 1 year ago

The output in this comment shows that the /data mount inside the container has all the files being owned by root, in other words UID 0. That reflects the UIDs of the files on the host system too even though you are looking at the mount from the container there. On my output for the same commands you can see all the files are owned by UID 1000, which is my UID on the host side. I would have expected those to be owned by UID 501 in your case.

That's not to say there are not some gremlins in the system. There may be platform differences I don't know to account for yet on the macOS side which I'll keep looking into as well (casile make should run on that demos repo for you as it does for me) but something is weird about those file ownerships.
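
The ownership comparison can be done without reading a full listing by eye. The sketch below (hypothetical helper names, plain POSIX tools) lists the distinct numeric owner UIDs of the entries in a directory; running it both on the host and inside the container against /data would show whether the two views agree.

```shell
#!/bin/sh
# List the distinct numeric owner UIDs of the entries in a directory.
# -A skips "." and "..", so the parent directory's owner is not counted.
distinct_owner_uids() {
    ls -An "$1" | awk '$1 != "total" { print $3 }' | sort -u
}

# Succeed only if every entry in the directory is owned by the given UID.
all_owned_by() {
    [ "$(distinct_owner_uids "$1")" = "$2" ]
}

distinct_owner_uids "$(pwd)"
```

In the two listings earlier in this thread, this would print 1000 for one and 0 for the other.
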

arunvickram commented 1 year ago

Got it, my bad. So there's definitely some discrepancy in the file ownership between the container and the Git repository itself. In that case, what do you recommend as the best course of action? Just play with casile and install the dependencies directly?

alerque commented 1 year ago

CaSILE has a lot of dependencies and I've never worked through installing them all anywhere but Linux, and almost exclusively Arch Linux at that. I would expect Docker to be a much easier way to get started before going down that road (although it would be nice to work that out sometime).

Docker has been pretty solid for me running CI jobs on both GitHub and GitLab and several self hosted situations. I even use it to build my own projects using old versions of CaSILE when I don't feel like updating them to the current version I have installed on my bare metal.

I think the issue here is that some recent security discoveries in Git & Docker caused a rush of "fixes" to shut the door on possible risk factors that were not very well thought through nor was migration made easy. I'm still discovering what those changes were. Back a month ago when I released everything was smooth, then suddenly everything stopped working and the last couple releases have only gotten some of the new kinks worked out.

arunvickram commented 1 year ago

Got it. I suppose for right now what I can do is that I can perhaps begin to work on the icu-lua and fluent-lua repositories and see if I can't get started with that first. It seems to me that using CaSILE might be a bit of a non-starter for my use case at the moment.

I've also been playing around with SILE itself, I think it's pretty good and promising. I do want to see if I can potentially integrate SILE directly as part of the pandoc-spiceland project (I should be able to for the most part, it's just a matter of using the custom pandoc implementation for SILE). But I also had the whimsical idea of also creating a layout engine from scratch (mainly to learn how one is made myself, and if it somehow becomes solidified, it could be integrated as part of pandoc-spiceland).

alerque commented 1 year ago

When v0.10.10 came out and I realized remote docker stuff was so screwed up I got pretty focused on that, then GH Actions started changing things up and Git changed security models and ... and ...

... In all that chaos I missed the fact that GNU Make got updated from 4.3 to 4.4 and changed a ton of things. They fixed some bugs that I was working around (and hence my workarounds broke) and outright changed some behaviour. CaSILE was mostly still working, but weird things were getting skipped and sometimes it would get in a loop, etc.

I think the differences are mostly accounted for now. As of v0.10.15 everything should be GNU Make 4.4 compatible. In fact it is now a minimum requirement because I'm not interested in fiddling with it enough to keep backwards compatibility. In Docker this shouldn't matter since the required version is provided.
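
For anyone running outside Docker, the GNU Make 4.4 minimum can be checked with a few lines of shell. This is a sketch under assumptions: the function names are hypothetical, the version is taken from the third word of the first line of `make --version` (as GNU Make prints it, e.g. "GNU Make 4.4.1"), and only the major and minor numbers are compared.

```shell
#!/bin/sh
# Succeed if a dotted version string (e.g. "4.4.1") is at least 4.4.
make_version_ok() {
    major=${1%%.*}
    case $1 in
        *.*) rest=${1#*.}; minor=${rest%%.*} ;;
        *) minor=0 ;;
    esac
    [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 4 ]; }
}

# Print the installed version, or nothing if `make` is missing.
installed_make_version() {
    make --version 2>/dev/null | awk 'NR == 1 { print $3 }'
}

v=$(installed_make_version)
if [ -n "$v" ] && make_version_ok "$v"; then
    echo "GNU Make $v meets the 4.4 minimum"
else
    echo "GNU Make 4.4 or newer not found (got: '${v:-none}')"
fi
```
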

All that to say, I would be interested to know whether the Docker image for v0.10.15 now works as expected on macOS. You can see from the casile-demos repo that the CI build is working again and posting the build results as attachments to the CI runs.

arunvickram commented 1 year ago

It works now. I'm able to generate both books. Thanks!