NathanVaughn / webtrees-docker

Up-to-date Docker image for webtrees with all the bells and whistles.
https://hub.docker.com/r/nathanvaughn/webtrees
MIT License

/data folder accessible over web #87

Closed Neriderc closed 2 years ago

Neriderc commented 2 years ago

The webtrees security page says to ensure the /data folder is not accessible, and provides a simple test to check that.

When I access webtrees.mydomain.com/data/config.ini.php, I don't get the access denied error. Instead I see the semicolon that the security page says appears when the data folder is accessible.
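The check described above can also be scripted. The following is a minimal sketch, not part of webtrees or this image: the function names are hypothetical, and it assumes that a protected /data folder answers with HTTP 403 while an exposed one answers with HTTP 200.

```python
import urllib.error
import urllib.request


def classify_status(status: int) -> str:
    """Interpret the HTTP status returned for /data/config.ini.php.

    Assumption: 403 means the folder is protected, 200 means it is reachable.
    """
    if status == 403:
        return "denied"
    if status == 200:
        return "accessible"
    return "unclear"


def probe(base_url: str) -> str:
    """Request <base_url>/data/config.ini.php and classify the answer."""
    url = base_url.rstrip("/") + "/data/config.ini.php"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return classify_status(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses; the code is still useful.
        return classify_status(err.code)
```

Calling `probe("https://webtrees.mydomain.com")` would then return `"accessible"` for the situation reported here and `"denied"` once the folder is protected.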

I've tried changing the permissions on the mapped /data volume on the host machine, but this did not seem to help.

Any tips on how to resolve this?

NathanVaughn commented 2 years ago

I just tried this on my own instance of Webtrees, and I got permission denied as expected. There should be an .htaccess file in the data folder that denies all access, like in the screenshot below:

image

I'm currently on business travel, but I will try to remember to investigate this when I get back. A new(ish) version of Webtrees may not have the .htaccess file in the release .zip or something, and it only works for me because I have the volume mounted.

Neriderc commented 2 years ago

Thanks for the reply. I have checked and that .htaccess file is missing. I've copied it across from the main webtrees repository, where it does seem to be in the latest release.

As a test, I created a new instance of webtrees using this docker image (latest tag) and when it generated the /data directory it again failed to create the .htaccess file, so I'm not sure what's wrong :(

It would be interesting if, when you get time, you (or someone else) could create a fresh install and see if you see the same problem.

NathanVaughn commented 2 years ago

I'm not sure what's going on. I started the ghcr.io/nathanvaughn/webtrees:latest container with no environment variables set and attached a console to it, and I can see the .htaccess file in the /var/www/webtrees/data directory. I had not run the setup at all.

image

Even if you weren't mounting the /var/www/webtrees/data/ directory, that file should still be persisted as it's in the base container. My best guess is that at some point you inadvertently deleted that file while the directory was persisted in a host bind mount or volume, and Webtrees, I guess, never automatically re-creates it.

I do have a desire to re-create the init script in Python so I'll probably add some functionality to recreate that file if it does not exist.

Neriderc commented 2 years ago

Ok I've had a bit of a play, and I can reproduce the cause, even if I don't understand it.

Here's a slightly modified version of your example docker-compose.yml file:

version: "3"

services:
  app:
    depends_on:
      - db
    image: ghcr.io/nathanvaughn/webtrees:latest
    ports:
      - 8095:80
    restart: unless-stopped
    #volumes:
    #  - ./app_data:/var/www/webtrees/data/
    #  - ./app_media:/var/www/webtrees/media/

  db:
    command: "--default-authentication-plugin=mysql_native_password"
    environment:
      MYSQL_DATABASE: "webtrees"
      MYSQL_USER: "webtrees"
      MYSQL_ROOT_PASSWORD: "badpassword"
      MYSQL_PASSWORD: "badpassword"
    image: mariadb:latest
    restart: unless-stopped
    #volumes:
    #  - ./db_data:/var/lib/mysql

Running this works fine, the container runs, setup is available, and accessing webtrees.mydomain.com/data/config.ini.php results in the expected access denied error.

Now if I uncomment the above lines related to declaring volumes, this breaks it. If I recreate the container with these volume mappings, for some reason it does not create the .htaccess file. The mapped directories are not created ahead of time; I let Docker create them to ensure correct ownership.

This is a bit confusing since the config.ini.php is created fine in the same directory. I'm not sure what could cause this?

This is a slightly different setup from your example file which has named volumes not mapped to the host (if that's the right terminology), but I want all my volumes mapped to a specific host directory for organisation and backup reasons.

Do you have any idea why mapping the volumes like this would cause the issue?

NathanVaughn commented 2 years ago

Oh man, that's a good catch. It comes down to how Docker treats volumes. If you use a named volume, I think Docker pre-populates it with anything already in the container at that path. If you use a host bind mount, however, Docker replaces the contents of the container directory with whatever is on the host; it doesn't sync the differences or anything.
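The difference can be modelled in a few lines. This is only an illustrative sketch of the behaviour described above, not Docker code: `visible_files` is a hypothetical helper, and the file sets are made up for the example. It also explains why config.ini.php still appears with a bind mount: that file is written at runtime into whatever directory is mounted, whereas .htaccess only ships inside the image.

```python
from typing import Optional, Set


def visible_files(image_files: Set[str], mount: Optional[str], existing: Set[str]) -> Set[str]:
    """Return the files the container sees at the mount path when it starts.

    mount:    None (no volume), "named" (named volume) or "bind" (host bind mount).
    existing: files already present in the named volume or host directory.
    """
    if mount is None:
        # No mount: the directory baked into the image is used as-is.
        return set(image_files)
    if mount == "named":
        # An empty named volume is seeded with the image's files;
        # a non-empty one is used unchanged.
        return set(existing) if existing else set(image_files)
    if mount == "bind":
        # A bind mount hides the image's files; only the host contents are visible.
        return set(existing)
    raise ValueError(f"unknown mount type: {mount!r}")


image = {".htaccess"}
print(visible_files(image, "named", set()))  # the image's .htaccess is copied in
print(visible_files(image, "bind", set()))   # empty host directory, so no .htaccess
```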

A good fix for this will be to update the container entrypoint script to check whether that file exists and re-create it automatically if it does not. I started some work on that this weekend on the new-entrypoint branch but have not finished it yet. https://github.com/NathanVaughn/webtrees-docker/blob/b94cbeb58d76b83bd73ab5aaa03744e18b829017/docker-entrypoint.py#L278-L289
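As a rough idea of what such a check could look like, here is a minimal sketch. It is not the code from the linked branch: the function name `ensure_htaccess` is hypothetical, and the file contents are an assumption that uses the Apache 2.4 directive for denying all requests rather than the exact file shipped with webtrees.

```python
from pathlib import Path

# Assumed contents (Apache 2.4 syntax); the file shipped with webtrees may differ.
DENY_ALL = "Require all denied\n"


def ensure_htaccess(data_dir: str) -> bool:
    """Create <data_dir>/.htaccess if it is missing.

    Returns True if the file was created, False if it already existed.
    """
    path = Path(data_dir) / ".htaccess"
    if path.exists():
        # Never overwrite a file that is already there.
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(DENY_ALL, encoding="utf-8")
    return True
```

Running something like `ensure_htaccess("/var/www/webtrees/data")` on every container start would cover the bind mount case, because the check happens after the host directory has been mounted over the image's copy.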

Neriderc commented 2 years ago

There isn't anything on the host when first started (not even the mapped directories), so Docker wouldn't have to sync any differences. In other containers I haven't had issues doing it this way. Some even map the whole of the www directory to the host and it seems to work fine, so it's interesting that this doesn't like it - and only has an issue with this one file. My experience seems to be that if you don't create the directory then docker will create and populate it.

I have no idea what makes this situation different, but you do seem to have a solution to this specific problem. I do wonder whether it could be narrowed down to a more specific root cause. Still, you have a solution on your side (and this is outside of your recommended setup anyway), and I have it fixed on my side, so there's probably not much point spending more time on it. Happy for this to be closed if you don't intend to use the issue to track your changes.

NathanVaughn commented 2 years ago

This should be resolved now with the latest push of the :2.0.21 tag and the new entrypoint script.