jzohrab / lute

DEPRECATED: LUTE (Learning Using Texts) is a self-hosted web app for learning language through reading, based on Learning with Texts (LWT)
The Unlicense

docker-image #64

Closed disfated closed 10 months ago

disfated commented 1 year ago

This is the minimum viable Dockerfile, which you can build and push to the docker hub.

jzohrab commented 1 year ago

Thanks for the PR @disfated - I have a couple of questions I'll add to the files.

While I have worked with Docker in the past, I'm by no means an authority. My biggest question is around the Docker volumes: I'm wondering how they will affect users who aren't as tech-savvy (and there are a lot of people coming into Lute who use Docker yet have trouble navigating the command line).

(Switched the branch to develop, code goes to develop for my testing and CI and then to master. I think my Lute PR docs need fixing.)

jzohrab commented 1 year ago

@disfated - I put a bunch of thoughts in this PR's comments, you don't need to respond to them. They're comments to help clear up my own thoughts, and for posterity.

In summary, I agree with a lot of what you say. Between the two evils, I think I'd prefer evil # 2, using mounts, just because it removes a bit of complexity from the user's side. There are some immediate improvements required to make this "mount" option more viable, which you mentioned:

When running, users would need to specify the data folder, and the backup folder ... I agree there is a chance for error there, but I don't know a way around that! As you mentioned, the goal for this stream of work is to make Lute easy to install, but -- in my mind, at least -- the better way to do that is to spend time in the python rewrite. (Which is a slog :-/ but I think it's necessary)

jzohrab commented 11 months ago

Following up. I've spent some hours reading through and trying out volumes, and feel that they're good for many ops-type situations, but that they could easily confuse non-techies and generate the inevitable support questions.

To summarize, I'd like to move forward with pre-built containers that use bind mounts, and provide simpler compose files and .env files for that, with some extra checks and documentation. This is the only thing that I'm comfortable supporting at the moment.

Some notes below:

Volume management isn't as easy as file management

When no volume is explicitly passed in the container run command, VOLUME in the Dockerfile creates an anonymous volume with a SHA id, e.g.:

$ docker volume ls
DRIVER    VOLUME NAME
local     1a078b998378121f021cde0e70b1c3a3e26f37a013ad49fbd853856edef0b783
local     3e6eaac05d79693395279a6df813efc75b4ec629a4dff14b73798853ce1f7b93
... etc

While this should work fine for just getting up and running, I can't see a clear path for managing that for users easily. You can also specify the volume name in a docker compose file or a command line, but that also requires users to get the flags right -- it sounds dead easy, but users sometimes do fun things.

Users can create and mount volumes, but given the challenges that they have already, I don't feel that's an option.

Interestingly, specifying the volume name in a compose file seems to create different volumes depending on where the file is stored, per https://forums.docker.com/t/docker-compose-prepends-directory-name-to-named-volumes/32835/9. I created a zzdocker folder with a dockerfile and compose file:

version: "3.9"

services:
  frontend:
    image: xxxx:latest
    volumes:
      - myapp:/blah/data
volumes:
  myapp:

Running this created a named volume:

$ docker volume ls
DRIVER    VOLUME NAME
...
local     zzdocker_myapp

I then moved the compose.yml file to another location and ran compose up again, and a new volume was created.

$ docker volume ls
DRIVER    VOLUME NAME
...
local     zzdocker_myapp
local     zzzzdocker_myapp

Based on this, I'm going to go with mounts.
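
The naming seen above can be sketched in a few lines of shell. This is only an illustration, assuming Compose builds the volume name as "<folder containing the compose file>_<volume key>" and ignoring any extra normalization of the folder name. compose_volume_name and the /home/me paths are made up for this example; they are not part of Docker.

```shell
#!/bin/sh
# Sketch only: mimics how the volume names above appear to be formed.
# compose_volume_name is a hypothetical helper, not a Docker command.
compose_volume_name() {
    project_dir=$1   # folder holding the compose file (hypothetical path)
    volume_key=$2    # key under the top-level "volumes:" section
    # The folder does not need to exist; basename only parses the string.
    printf '%s_%s\n' "$(basename "$project_dir")" "$volume_key"
}

# Same compose file, two different folders, two different volume names.
compose_volume_name /home/me/zzdocker myapp
compose_volume_name /home/me/zzzzdocker myapp
```

Setting the project name explicitly (docker compose -p, or the COMPOSE_PROJECT_NAME variable) should keep the prefix stable, but that is one more thing for users to get right.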

Mis-mount = big problems

As mentioned, mounts create some dangers: users could mount things incorrectly! So that would be terrible. That can be mitigated in a few ways:

  1. The .env for the application could be baked into the image, along with some default settings (e.g. backup enabled, 10 backups, etc.)

  2. A separate, much smaller .env file for docker compose to use could be created. This would just have two things:

DATA_PATH=xxxx
BACKUP_PATH=yyyy

Then the compose file would reference that .env file, and those values would be used for the mounts. Compose does check that the directory exists when it runs compose up.

  3. We can also check the content of the mount listing if needed in the entrypoint script, e.g.:
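#!/bin/sh
# Entrypoint guard: refuse to start unless the data directory is a mount.

REQUIREDDIR=/lute/data

# Look for the required directory in the container's mount table.
mountVar=$(mount | grep "$REQUIREDDIR")

echo "got mountVar = $mountVar"

if [ -z "$mountVar" ]
then
    echo "$REQUIREDDIR not mounted ... exit"
    # Non-zero status so the failed start is visible to the caller.
    exit 1
else
    echo "$REQUIREDDIR mounted"
    # Marker file so users can confirm the host folder gets populated.
    touch "${REQUIREDDIR}/hello.txt"
fi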

That way, if some users decided to mount things directly from the command line (docker run -whatever-flags ...), the script would at least guard against a missing mount.

  4. Documentation! Once things are configured, add a big red exclamation point asking people to verify that the folders they expected to get populated are in fact getting populated.
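
The smaller compose .env file described in the list above could also be checked before compose is started. A minimal sketch, assuming the file holds plain DATA_PATH=... and BACKUP_PATH=... lines with no quoting; check_mount_paths is a hypothetical helper, not something in Lute or Docker.

```shell
#!/bin/sh
# Sketch only: check_mount_paths is a hypothetical pre-flight helper.
# It reads DATA_PATH and BACKUP_PATH from a compose .env file and
# verifies that both are set and point at existing directories.
check_mount_paths() {
    env_file=$1
    [ -f "$env_file" ] || { echo "missing $env_file"; return 1; }

    # Assumes simple KEY=value lines with no quotes around the value.
    data_path=$(sed -n 's/^DATA_PATH=//p' "$env_file")
    backup_path=$(sed -n 's/^BACKUP_PATH=//p' "$env_file")

    for dir in "$data_path" "$backup_path"; do
        if [ -z "$dir" ] || [ ! -d "$dir" ]; then
            echo "bad or missing mount folder: '$dir'"
            return 1
        fi
    done
    echo "mount folders ok"
}
```

It could be chained in front of the real command, for example: check_mount_paths .env && docker compose up. This complements the in-container entrypoint check: it catches a wrong host path before the container starts.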

buildx is still needed

For the image to work on things like Raspberry Pi, different Linux architectures, etc., the build still needs to output multi-arch images. This is separate from the Dockerfile and compose files; I'm just noting it here as a necessary part of the build. A good article is here: https://www.docker.com/blog/multi-arch-build-and-images-the-simple-way/
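
As a rough sketch of what such a build might look like, the script below only prints a docker buildx command rather than running it. The image name and platform list are assumptions chosen for illustration (they are not the values used for Lute), and buildx_command is a hypothetical helper.

```shell
#!/bin/sh
# Sketch only: prints a multi-arch build command instead of running it.
# IMAGE and PLATFORMS are assumed values for illustration.
IMAGE=${IMAGE:-example/lute:latest}
PLATFORMS=${PLATFORMS:-linux/amd64,linux/arm64,linux/arm/v7}

buildx_command() {
    # --platform takes a comma-separated list; --push uploads the result
    # to the registry, which multi-platform output generally needs.
    echo "docker buildx build --platform $PLATFORMS --tag $IMAGE --push ."
}

buildx_command
```

Printing the command first makes it easy to review the platform list before a long cross-architecture build is started.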

Summary

Pre-built containers are still the way to go for this version of Lute. I've been getting some basic things down for v3 Python, and think that it will be much easier to work with, especially since sqlite is baked into Python.

jzohrab commented 10 months ago

Closing this, thanks @disfated for the discussion and notes.

I've implemented pre-built docker containers in lute v3 (https://github.com/jzohrab/lute-v3/) with a semi-automated way to push images to docker hub (https://hub.docker.com/repository/docker/jzohrab/lute3/general). The lute v3 docs (https://jzohrab.github.io/lute-manual/install.html) also refer to those pre-built images; it's a much better intro.

Also implemented are "settings" so there's no more '.env' file for people to hack at.

Phew!