pretalx / pretalx-docker

Docker setup for a complete pretalx installation. Community-sourced, not officially supported.
29 stars 39 forks source link

media folder not accessible when not in debug mode #62

Open robe2 opened 3 months ago

robe2 commented 3 months ago

When I run my pretalx with debug=false

/media/whatever always resolves to a 404, though the /static/ works fine.

I don't think the /static is reading from the right folder though, cause if I remove a file from where it should be, it still seems to read the file fine.

For my docker-compose.yml, I have this: The docker image, was built using the docker recipe in this repo + adding pretalx-public-voting and pretalx-public-pages plugins

version: '3'
services:
  pretalx:
    image: docker.osgeo.org/sac/pretalx-osgeo:latest
    container_name: pretalx
    restart: unless-stopped
    depends_on:
      - redis
      - db
    environment:
       PRETALX_FILESYSTEM_MEDIA: /data/media
       PRETALX_FILESYSTEM_STATIC: /data/static
    ports:
      - "8355:80"
    volumes:
      - ./conf/pretalx.cfg:/etc/pretalx/pretalx.cfg:ro
      - pretalx-data:/data
    labels:
       traefik.docker.network: "pretalxdocker"
       traefik.enable: "true"
:

volumes:
  pretalx-db:
  pretalx-data:
  pretalx-redis:

What I do notice is when I upload a file to an event, it does correctly go into the pretalx-data volume (even when debug=false), so it seems to be just the web url mapping that is off.

I also tried changing from docker volumes to just folder mounts and had the same issue.

rixx commented 3 months ago

pretalx doesn't serve files in /media, FWIW, that's the job of the webserver of choice (in this case traefic, apparently).

No idea how this shakes out with Docker and volumes etc, just as a hint to go error hunting. Happy to merge a PR.

robe2 commented 3 months ago

pretalx doesn't serve files in /media, FWIW, that's the job of the webserver of choice (in this case traefic, apparently).

No idea how this shakes out with Docker and volumes etc, just as a hint to go error hunting. Happy to merge a PR.

I'm running nginx as proxy, but on a different server, I tried removing those labels for traefik, which were copied from the boilerplate docker-compose.yml, but that sadly didn't help.

My nginx is pretty much boiler plate described here:

https://github.com/pretalx/pretalx-docker/blob/main/reverse-proxy-examples/nginx/Readme.MD

Except instead of localhost, it's the private network name of the server I have pretalx housed on.

I'm assuming it's something in the GUnicorn setup. Isn't that what is running as the internal webserver.

Neither the traefik or the nginx instructions, make mention of explicitly setting the location of the https://server/media and https://server/static paths.

I had assumed this was handled by:

PRETALX_FILESYSTEM_MEDIA: /data/media
PRETALX_FILESYSTEM_STATIC: /data/static

FWIW, this all used to work fine in pretalx v2.2.0 version which I'm upgrading from. The main difference I see between the docker-compose of then and now is the introduction of the these new system variables, and removing them from the pretalx.cfg file.

rixx commented 3 months ago

Yeah, this repo is entirely community-maintained and not officially recommended, as I have no interest in Docker, so I just merge all PRs this repo gets (that don't look obviously malicious). pretalx has never served files under /media on its own, that was always up to the web server, so some of the changes must have broken this. (Making me think it may be time to remove even the link to this repo from the official docs, tbh – probably better to not mention this repo at all rather than having people get frustrated with it not working.)

robe2 commented 3 months ago

Okay I found the commit that did it. I guess before nginx was included in the image.

https://github.com/pretalx/pretalx-docker/commit/702197a82d924d043dd8ac42cc9491387246d1ce

there probably should be some documentation around this, or have a docker-compose that launches an nginx or traefik container that people can remove if they so choose.

rixx commented 3 months ago

Happy to merge a PR to that end.

robe2 commented 3 months ago

Happy to merge a PR to that end.

I whipped up a docker-compose that includes an nginx container (and restores the nginx.conf file) so it behaves like it used to. I'll be happy to provide a PR for that, but probably better as a template or a commented out section in the Dockerfile.

My thought is just have it as a commented out section.

But it feels a little like overkill, in my case since I'm just going to proxy that nginx to my production nginx that handles the certs, but my webserver is on a separate server so doesn't have access to the folders or docker volumes of this server.

What's the reason why media can never be served except when in debug mode? I was thinking any concerns about access would be satisfied with the GUNICORN_FORWARDED_ALLOW_IPS but that didn't seem to work either.

If I am understanding correctly that the nginx and traefik.toml need to serve the /media files directly, then the instructions in reverse-proxy-examples/docker-compose and reverse-proxy-examples/nginx/Readme.MD are out of date as well, since they don't have a section for the /media folder

rixx commented 3 months ago

They aren't so much "out of date" as straight up wrong, as pretalx has always behaved like that. We follow the Django project guidance on this issue, which is to not use Django to serve user-uploaded files.

robe2 commented 3 months ago

They aren't so much "out of date" as straight up wrong, as pretalx has always behaved like that. We follow the Django project guidance on this issue, which is to not use Django to serve user-uploaded files.

In the 2.3, it was right, because internally there was an nginx running as part of the pretalx image, and you can see from that, that it did do as you described, specifically serve up the media, static and then delegate the other to GUNICORN which it had running on a unix socket.

https://github.com/pretalx/pretalx-docker/blob/v2.3.2/deployment/docker/nginx.conf#L48

But all that I think became invalid as soon as the nginx in the container was removed to save space.

My plan is to basically restore this file, but have it mount to the nginx instance I am adding to the docker compose.

So my revised docker-compose will look like this:

version: '3'
services:
  pretalx:
    image: pretalx/standalone:latest
    container_name: pretalx
    restart: unless-stopped
    depends_on:
      - redis
      - db
    environment:
       PRETALX_FILESYSTEM_MEDIA: /data/media
       PRETALX_FILESYSTEM_STATIC: /data/static
    volumes:
      - ./conf/pretalx.cfg:/etc/pretalx/pretalx.cfg:ro
      - pretalx-data:/data
:
  web:
    image: nginx:latest
    ports:
      - "80:80"
    depends_on:
      - pretalx
    volumes:
      - ./deployment/docker/nginx.conf:/etc/nginx/nginx.conf:ro
      - pretalx-data:/data

volumes:
  pretalx-db:
  pretalx-data:
  pretalx-redis:

and then changing this line https://github.com/pretalx/pretalx-docker/blob/v2.3.2/deployment/docker/nginx.conf#L61

to

proxy_pass http://pretalx:80;

I think similar could be done with traefik for those who'd prefer to get their cert etc as part of the docker compose launch, but I'm less familiar with traefik and have a dedicated nginx server on the private network that handles the certs already.

80:80 of the nginx container could be changed to 8355:80 if people have another local nginx that is monitoring their certs already running on 80.

almereyda commented 3 months ago

It is generally good practice to run separate containers for separate processes and even better to only run a single process per container.

What used to be the standalone container is now what would often we called the app container, while Nginx serves as web container that does all static hosting and forwarding to the app for appropriate routes. This comes with the need of a shared volume for Django's static assets regenerated during each migration.

I will be happy to review a PR, when it is filed, or to contribute one during the next setup that's ongoing until May first.

I can second many ideas and arguments from the discussion and the recent commits. Though I'm a bit surprised the setup is listening on port 80 of the host. Most probably there is going to be another reverse proxy, very often Traefik, that acts as web and webs endpoint. I think we can safely assume the Nginx/web container to be exposed to an internal web network only, while carrying just enough Traefik labels to document a complete live environment.

We can provide different compose.yaml, compose.live.yaml, compose.dev.yaml examples to the repository to show how things can be adapted. Because we already see that the expectation of a live deployment is not shared by everyone, as #43 by @MikkCZ clearly states and implies to support the development use case first.

We can also compare the practices here a bit against how other major Django projects do it. Pretix, Authentik, Mailman or FidusWriter come to mind.

And I agree that this repository is in some kind of messorganic shape. What are the Ansible folders for, for example? Happy to find them refactored into a separate repository later on.

It will also be good to revise the README with a documentation of the whole lifecycle of the application, including eventual intermediary steps during migrations.

Right now the repository is in an opinionated and underdocumented state, which we can gradually improve at the different places suggested above. More on Sunday.

rixx commented 3 months ago

I wasn't aware how broken this repository is at the moment, tbh, and it's looking a bit like "community-maintained" doesn't really work.

I will observe the situation for a while longer (say another three months, until ~August), and if the repository is still not in a usable state, I think it would be best to archive it. There are already at least three other Docker setups for pretalx up here on GitHub, so it's not like we'd be leaving people without anything to build on – and it would presumably be less frustrating to have to figure something out, than to try a repo that is (despite big warnings) hosted under the official pretalx org, only to find out it doesn't work. Or, as is the case here, only works with insecure settings, which is even worse, as it encourages people to run pretalx in development mode.

DASPRiD commented 2 months ago

I don't think that this is actually a bug. Pretalx itself does say that media (and preferably static) should be served by a separate application/container. This is really easy to do (e.g. with https://hub.docker.com/r/halverneus/static-file-server).

So I don't think that this repository is actually broken, beside the issue with env forwarding (for which I opened a PR already).

DASPRiD commented 2 months ago

Also I want to note: all three linked docker setups are working the same way, where they don't serve the media directory themself either.

Zaczero commented 1 month ago

Pretalx itself does say that media (and preferably static) should be served by a separate application/container.

Such core functionality should be a part of the application itself. It's nonsense to require a dedicated container just for serving media files.

almereyda commented 1 month ago

I would politely disagree. It is a common pattern (in container land; don't know about the rest) to offload serving static media files to statically compiled languages, while dynamic routes are computed with the interpreted language of the application.

Serving from the application is useful for development and testing instances. Load-bearing environments pose different constraints on the overall system.

rixx commented 1 month ago

It's nonsense to require a dedicated container just for serving media files.

A Docker setup is not an integral part of pretalx, so there is not "a dedicated container" required. pretalx (as-an-application) requires a web server for multiple purposes, like terminating SSL connections, and serving static files and media files is simply one of them. This is unlikely to change in the future.

Zaczero commented 1 month ago

@rixx But this is simply an odd design decision. Given my self-hosting experience, I have never encountered other software that misses such core functionality out-of-the-box (if you know of other tools like this, please share!). This does not seem like a technical limitation since it works in debug mode. I understand the performance side of things, but not all users care about the millisecond savings. With Docker, it's better to serve self-contained solutions instead of making it mandatory to set up additional software around it. A very small percentage of users operate at 'business' scale.

rixx commented 1 month ago

But this is simply an odd design decision.

@Zaczero We are following Django guidance on the topic, which does not recommend serving media files directly in production. Serving media files (aka user uploaded files) well and file-system agnostic (taking into account things like S3 buckets etc as possible backends) is not a trivial task.

This does not seem like a technical limitation since it works in debug mode.

See again the Django documentation linked above.

With Docker, it's better to serve self-contained solutions instead of making it mandatory to set up additional software around it. A very small percentage of users operate at 'business' scale.

pretalx is not built for a Docker setup, and will not make changes to accommodate Docker in particular, especially given the drawbacks. To follow your argument, a very small percentage of users operate pretalx using Docker.


This issue should be resolved when #64 is merged. Please keep comments in this repository on topic (that is: the docker setup); debating how pretalx works is neither on topic in this issue, nor likely to change anything.

Zaczero commented 1 month ago

I understand you may have your reasons and there is nothing wrong with that. I simply express my opinion as a self-hosted software user, comparing your approach to others. I am not saying your approach is incorrect, I am just saying it's atypical for self-hosted deployments. What you do with this feedback is completely up to you.

See again the Django documentation linked above.

"again" but this is the first time I'm reading your response 🙂.

almereyda commented 1 month ago

It is a lot less effort to keep a cleanly separated, containerised setup that follows Django recommendations for secure system setups, than to build and maintain a custom functionality that is not expected from the upstream developers.

I would suggest to discuss this subject with the Django upstream project.

robe2 commented 1 month ago

Sorry for causing such an upheaval. This docker repo works fine for me once I got passed the need of having a separate container for the nginx container. I wanted to wait a bit before submitting a patch since I'm still a little green about docker. Now given all the discussion, I think just having a separate yml file, maybe calling it docker-compose-nginx.yml might do.

We might also want to have one for folks wanting to run with postgres instead of mysql, or have that as some override file. I haven't gotten to that point yet of migrating this to postgres. I recall the complaint someone had about changing the main one to postgres was because it's a serious breaking change, but having a separate docker-compose for that makes sense to me for those with no upgrade issues to worry about.

I was debating if I should bother explaining how to get certs, but decided, I should leave that for others to figure out cause there are so many ways that is done.

I was also looking at weblate, which I use, and is a django app too and liked, but they do it the old way this docker used to do, by including the nginx, but using nginx lite https://github.com/WeblateOrg/docker-compose . Not sure if that would get rid of the annoyance of why nginx was removed cause it made the container too fat.

But given that some people might prefer to have their nginx / trafeik completely separate, I think having a docker config that includes an nginx container and showing how to configure to share the volume makes the most sense to me.

I'll try to submit my patch soon once I've cleaned it up a bit.

almereyda commented 1 month ago

@robe2 Please feel invited to check out the work done in #64 until we finish with polishing the documentation there.

robe2 commented 1 month ago

@robe2 Please feel invited to check out the work done in #64 until we finish with polishing the documentation there.

Okay I'll hold of on putting in a pull request until you are done as yours seems like it might change a lot. I see in your pull request you addressed this one to have a separate nginx container. so I think once your pull request is committed, as you noted, it will close this issue as well.