docker-library / drupal

Docker Official Image packaging for Drupal

When using fpm, it is not clear how to make image styles work #215

Closed alberto56 closed 2 years ago

alberto56 commented 2 years ago

Context

The FPM variant of the Drupal image is meant to be used with a reverse proxy. That is, the webserver is in a separate container from the Drupal code. The webserver then communicates with Drupal using TCP on port 9000.

The "php:-fpm" section of the Docker homepage for this image contains resources on how to use this.

Based on those resources and my own research, I have written the article PHP and Apache (or Nginx) in separate Docker containers using Docker Compose, March 25, 2022, Dcycle blog, which shows how to link the FPM tag of PHP with an Apache frontend.

I then tried to get this working with Drupal, and everything works perfectly except image styles.

Normally, Drupal image styles are generated only if they do not exist. Here is an example as per my understanding:

The problem

This works as expected with images that combine Apache and Drupal. However, when using FPM, a request to /sites/default/files/styles/large/public/2022-03/kittens01.jpg, or indeed to /sites/default/files/this/can/be/anything/that/does/not/exist, results in an Apache message to the effect that the file does not exist. Drupal is not called at all and does not have the opportunity to generate the image style.
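
To make the expected behaviour concrete, here is a minimal shell sketch of the routing decision that the "!-f" and "!-d" rewrite conditions in Drupal's webroot .htaccess express: serve what exists on disk, and hand everything else to index.php. The function name route_request and its output labels are hypothetical and only for illustration; the favicon.ico exception is left out, and this is not how Apache evaluates the rules internally.

```shell
#!/bin/sh
# Hypothetical helper (not part of Drupal or Apache): print how a request
# would be routed if only the "!-f" / "!-d" rewrite conditions applied.
# $1 = document root, $2 = request path starting with "/"
route_request() {
  docroot=$1
  path=$2
  if [ -f "$docroot$path" ] || [ -d "$docroot$path" ]; then
    # The file or directory exists, so the webserver serves it directly.
    echo "static"
  else
    # Nothing on disk: the request is passed to the front controller,
    # which is where Drupal would generate a missing image style.
    echo "index.php"
  fi
}
```

For example, route_request /var/www/html /sites/default/files/styles/large/public/2022-03/kittens01.jpg reports which branch applies for that path on a given document root.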

To reproduce (example)

I maintain a project called Starterkit for a complete Drupal 9 site which I just converted to a two-container architecture (Drupal FPM, based on drupal:9-fpm).

You can see the issue by doing the following:

git clone https://github.com/dcycle/starterkit-drupalsite.git
cd starterkit-drupalsite
./scripts/deploy.sh

This will give you a result such as:

If all went well you can now access your site at:

 => Drupal: http://0.0.0.0:52720/user/reset/1/1648760983/75PxSllB0NFbeZMeV9mHmH-8SSHg47rp-6uH60Jx5FI/login

The port is random; yours will differ. Now you can run:

curl -f -I "http://0.0.0.0:52720/sites/default/files/styles/medium/public/2022-03/kittens01.jpg?itok=ptUkfU3k"

HTTP/1.1 404 Not Found
Date: Thu, 31 Mar 2022 21:11:44 GMT
Server: Apache/2.4.53 (Unix)
X-Content-Type-Options: nosniff
Content-Type: text/html; charset=iso-8859-1

curl: (22) The requested URL returned error: 404

I have surmised that when we use a reverse proxy, Drupal's 'index.php' is never called when a non-existing file inside /sites/default/files/* is requested.

I think there is something "special" about the non-FPM version of the Drupal image, which causes /sites/default/files/whatever/whatever to call Drupal's main index.php file. And that does not happen in the FPM version of the Drupal image.

Consider the following project

Project structure:

- web-files
  - subfolder
    - .htaccess
  - .htaccess
  - index.php
- docker-compose.yml
- Dockerfile-apache
- php.apache.conf

web-files/subfolder/.htaccess

This is the same .htaccess that is in Drupal's sites/default/files:

# Turn off all options we don't need.
Options -Indexes -ExecCGI -Includes -MultiViews

# Set the catch-all handler to prevent scripts from being executed.
SetHandler Drupal_Security_Do_Not_Remove_See_SA_2006_006
<Files *>
  # Override the handler again if we're run later in the evaluation list.
  SetHandler Drupal_Security_Do_Not_Remove_See_SA_2013_003
</Files>

# If we know how to do it safely, disable the PHP engine entirely.
<IfModule mod_php5.c>
  php_flag engine off
</IfModule>
<IfModule mod_php7.c>
  php_flag engine off
</IfModule>

web-files/.htaccess

This is the same .htaccess in Drupal's webroot:

#
# Apache/PHP/Drupal settings:
#

# Protect files and directories from prying eyes.
<FilesMatch "\.(engine|inc|install|make|module|profile|po|sh|.*sql|theme|twig|tpl(\.php)?|xtmpl|yml)(~|\.sw[op]|\.bak|\.orig|\.save)?$|^(\.(?!well-known).*|Entries.*|Repository|Root|Tag|Template|composer\.(json|lock)|web\.config)$|^#.*#$|\.php(~|\.sw[op]|\.bak|\.orig|\.save)$">
  <IfModule mod_authz_core.c>
    Require all denied
  </IfModule>
  <IfModule !mod_authz_core.c>
    Order allow,deny
  </IfModule>
</FilesMatch>

# Don't show directory listings for URLs which map to a directory.
Options -Indexes

# Set the default handler.
DirectoryIndex index.php index.html index.htm

# Add correct encoding for SVGZ.
AddType image/svg+xml svg svgz
AddEncoding gzip svgz

# Most of the following PHP settings cannot be changed at runtime. See
# sites/default/default.settings.php and
# Drupal\Core\DrupalKernel::bootEnvironment() for settings that can be
# changed at runtime.

# PHP 7, Apache 1 and 2.
<IfModule mod_php7.c>
  php_value assert.active                   0
</IfModule>

# Requires mod_expires to be enabled.
<IfModule mod_expires.c>
  # Enable expirations.
  ExpiresActive On

  # Cache all files for 2 weeks after access (A).
  ExpiresDefault A1209600

  <FilesMatch \.php$>
    # Do not allow PHP scripts to be cached unless they explicitly send cache
    # headers themselves. Otherwise all scripts would have to overwrite the
    # headers set by mod_expires if they want another caching behavior. This may
    # fail if an error occurs early in the bootstrap process, and it may cause
    # problems if a non-Drupal PHP file is installed in a subdirectory.
    ExpiresActive Off
  </FilesMatch>
</IfModule>

# Set a fallback resource if mod_rewrite is not enabled. This allows Drupal to
# work without clean URLs. This requires Apache version >= 2.2.16. If Drupal is
# not accessed by the top level URL (i.e.: http://example.com/drupal/ instead of
# http://example.com/), the path to index.php will need to be adjusted.
<IfModule !mod_rewrite.c>
  FallbackResource /index.php
</IfModule>

# Various rewrite rules.
<IfModule mod_rewrite.c>
  RewriteEngine on

  # Set "protossl" to "s" if we were accessed via https://.  This is used later
  # if you enable "www." stripping or enforcement, in order to ensure that
  # you don't bounce between http and https.
  RewriteRule ^ - [E=protossl]
  RewriteCond %{HTTPS} on
  RewriteRule ^ - [E=protossl:s]

  # Make sure Authorization HTTP header is available to PHP
  # even when running as CGI or FastCGI.
  RewriteRule ^ - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]

  # Block access to "hidden" directories whose names begin with a period. This
  # includes directories used by version control systems such as Subversion or
  # Git to store control files. Files whose names begin with a period, as well
  # as the control files used by CVS, are protected by the FilesMatch directive
  # above.
  #
  # NOTE: This only works when mod_rewrite is loaded. Without mod_rewrite, it is
  # not possible to block access to entire directories from .htaccess because
  # <DirectoryMatch> is not allowed here.
  #
  # If you do not have mod_rewrite installed, you should remove these
  # directories from your webroot or otherwise protect them from being
  # downloaded.
  RewriteRule "/\.|^\.(?!well-known/)" - [F]

  # If your site can be accessed both with and without the 'www.' prefix, you
  # can use one of the following settings to redirect users to your preferred
  # URL, either WITH or WITHOUT the 'www.' prefix. Choose ONLY one option:
  #
  # To redirect all users to access the site WITH the 'www.' prefix,
  # (http://example.com/foo will be redirected to http://www.example.com/foo)
  # uncomment the following:
  # RewriteCond %{HTTP_HOST} .
  # RewriteCond %{HTTP_HOST} !^www\. [NC]
  # RewriteRule ^ http%{ENV:protossl}://www.%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
  #
  # To redirect all users to access the site WITHOUT the 'www.' prefix,
  # (http://www.example.com/foo will be redirected to http://example.com/foo)
  # uncomment the following:
  # RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
  # RewriteRule ^ http%{ENV:protossl}://%1%{REQUEST_URI} [L,R=301]

  # Modify the RewriteBase if you are using Drupal in a subdirectory or in a
  # VirtualDocumentRoot and the rewrite rules are not working properly.
  # For example if your site is at http://example.com/drupal uncomment and
  # modify the following line:
  # RewriteBase /drupal
  #
  # If your site is running in a VirtualDocumentRoot at http://example.com/,
  # uncomment the following line:
  # RewriteBase /

  # Redirect common PHP files to their new locations.
  RewriteCond %{REQUEST_URI} ^(.*)?/(install\.php) [OR]
  RewriteCond %{REQUEST_URI} ^(.*)?/(rebuild\.php)
  RewriteCond %{REQUEST_URI} !core
  RewriteRule ^ %1/core/%2 [L,QSA,R=301]

  # Rewrite install.php during installation to see if mod_rewrite is working
  RewriteRule ^core/install\.php core/install.php?rewrite=ok [QSA,L]

  # Pass all requests not referring directly to files in the filesystem to
  # index.php.
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteCond %{REQUEST_URI} !=/favicon.ico
  RewriteRule ^ index.php [L]

  # For security reasons, deny access to other PHP files on public sites.
  # Note: The following URI conditions are not anchored at the start (^),
  # because Drupal may be located in a subdirectory. To further improve
  # security, you can replace '!/' with '!^/'.
  # Allow access to PHP files in /core (like authorize.php or install.php):
  RewriteCond %{REQUEST_URI} !/core/[^/]*\.php$
  # Allow access to test-specific PHP files:
  RewriteCond %{REQUEST_URI} !/core/modules/system/tests/https?\.php
  # Allow access to Statistics module's custom front controller.
  # Copy and adapt this rule to directly execute PHP files in contributed or
  # custom modules or to run another PHP application in the same directory.
  RewriteCond %{REQUEST_URI} !/core/modules/statistics/statistics\.php$
  # Deny access to any other PHP files that do not match the rules above.
  # Specifically, disallow autoload.php from being served directly.
  RewriteRule "^(.+/.*|autoload)\.php($|/)" - [F]

  # Rules to correctly serve gzip compressed CSS and JS files.
  # Requires both mod_rewrite and mod_headers to be enabled.
  <IfModule mod_headers.c>
    # Serve gzip compressed CSS files if they exist and the client accepts gzip.
    RewriteCond %{HTTP:Accept-encoding} gzip
    RewriteCond %{REQUEST_FILENAME}\.gz -s
    RewriteRule ^(.*)\.css $1\.css\.gz [QSA]

    # Serve gzip compressed JS files if they exist and the client accepts gzip.
    RewriteCond %{HTTP:Accept-encoding} gzip
    RewriteCond %{REQUEST_FILENAME}\.gz -s
    RewriteRule ^(.*)\.js $1\.js\.gz [QSA]

    # Serve correct content types, and prevent double compression.
    RewriteRule \.css\.gz$ - [T=text/css,E=no-gzip:1,E=no-brotli:1]
    RewriteRule \.js\.gz$ - [T=text/javascript,E=no-gzip:1,E=no-brotli:1]

    <FilesMatch "(\.js\.gz|\.css\.gz)$">
      # Serve correct encoding type.
      Header set Content-Encoding gzip
      # Force proxies to cache gzipped & non-gzipped css/js files separately.
      Header append Vary Accept-Encoding
    </FilesMatch>
  </IfModule>
</IfModule>

# Various header fixes.
<IfModule mod_headers.c>
  # Disable content sniffing, since it's an attack vector.
  Header always set X-Content-Type-Options nosniff
  # Disable Proxy header, since it's an attack vector.
  RequestHeader unset Proxy
</IfModule>

web-files/index.php

Hello World

docker-compose.yml

Here we're creating a bunch of services, including a basic drupal:9-based service, which works fine, and a two-container setup.

---
version: '3'

services:
  apache_and_files_drupal:
    image: drupal:9
    volumes:
      - "./web-files:/opt/drupal/web"
    ports:
      - 8767:80

  apache_and_files:
    image: php:apache
    volumes:
      - "./web-files:/var/www/html"
    ports:
      - 8766:80

  php:
    image: drupal:9-fpm-alpine
    volumes:
      - "./web-files:/var/www/html"

  apache_separate_from_files:
    build:
      context: .
      dockerfile: Dockerfile-apache
    volumes:
      - "./web-files:/var/www/html"
    ports:
      - 8765:80

Dockerfile-apache

FROM httpd:alpine

COPY php.apache.conf /usr/local/apache2/conf/php.apache.conf
RUN echo "Include /usr/local/apache2/conf/php.apache.conf" \
    >> /usr/local/apache2/conf/httpd.conf

php.apache.conf

ServerName localhost

LoadModule deflate_module /usr/local/apache2/modules/mod_deflate.so
LoadModule proxy_module /usr/local/apache2/modules/mod_proxy.so
LoadModule proxy_fcgi_module /usr/local/apache2/modules/mod_proxy_fcgi.so

<VirtualHost *:80>
    ProxyPassMatch ^/(.*\.php(/.*)?)$ fcgi://php:9000/var/www/html/$1
    DocumentRoot /var/www/html/
    <Directory /var/www/html/>
        DirectoryIndex index.php
        Options Indexes FollowSymLinks
        AllowOverride All
        Require all granted
    </Directory>

    # Send apache logs to stdout and stderr
    CustomLog /proc/self/fd/1 common
    ErrorLog /proc/self/fd/2
</VirtualHost>
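
As a rough aid for reading the ProxyPassMatch line above, the following shell sketch checks which raw URL paths match its pattern. The pattern is copied from php.apache.conf; the helper name proxy_target and its "fpm"/"apache" labels are hypothetical. It uses grep -E rather than Apache's own regex engine, which I assume behaves the same for this simple pattern, and it ignores any rewriting that happens before or after the match.

```shell
#!/bin/sh
# Pattern copied from the ProxyPassMatch line in php.apache.conf.
PROXY_PATTERN='^/(.*\.php(/.*)?)$'

# Hypothetical helper: print "fpm" if the URL path matches the pattern
# (and would be proxied to fcgi://php:9000), "apache" otherwise.
proxy_target() {
  if printf '%s\n' "$1" | grep -Eq "$PROXY_PATTERN"; then
    echo "fpm"
  else
    echo "apache"
  fi
}
```

Under these assumptions, /index.php is matched by the pattern, while /subfolder/a/b/c is not, so that request is only sent to FPM if something first maps it to a path ending in .php.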

Reproducing the issue

In the above folder, run:

docker-compose up -d

Then you can visit:

$ curl http://0.0.0.0:8766/
Hello World
$ curl http://0.0.0.0:8766/a/b/c
Hello World
$ curl http://0.0.0.0:8766/subfolder/a/b/c
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL was not found on this server.</p>
<hr>
<address>Apache/2.4.53 (Debian) Server at 0.0.0.0 Port 8766</address>
</body></html>
$ curl http://0.0.0.0:8765/
Hello World
$ curl http://0.0.0.0:8765/subfolder/a/b/c
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL was not found on this server.</p>
</body></html>
$ curl http://0.0.0.0:8765/subfolder/a/b/c
File not found.
$ curl http://0.0.0.0:8767/
Hello World
$ curl http://0.0.0.0:8767/a/b/c
Hello World
$ curl http://0.0.0.0:8767/subfolder/a/b/c/
Hello World

The case of http://0.0.0.0:8767/subfolder/a/b/c/ is based on the image drupal:9, which includes both Apache and Drupal. We see "Hello World" even though we're requesting the non-existing /subfolder/a/b/c/. This is the desired behaviour, because the base index.php file is being called (if this were a running Drupal site, it would allow Drupal to generate the image style).

In the case of curl http://0.0.0.0:8766/subfolder/a/b/c, we get a "not found": the index.php file is never executed. This uses php:apache, not drupal:9, which leads me to believe that drupal:9 is doing something at the server level that php:apache is not doing. I have tried to figure out what this is (a conf file? something else?) and could not. I would greatly appreciate any guidance on what is special about the drupal:9 image that causes /subfolder/a/b/c/ to load index.php.

Finally, the case of http://0.0.0.0:8765/subfolder/a/b/c showing "not found" is what I'm trying to fix. I would like that to load "Hello World", the same way drupal:9 does. Port 8765 serves the httpd:alpine image, which communicates with the drupal:9-fpm-alpine service via TCP. I would like http://0.0.0.0:8765/subfolder/a/b/c to work the same way as http://0.0.0.0:8767/subfolder/a/b/c/ (the single-container solution) does.

Thanks for any guidance on how to get image styles to generate on a setup where FPM and the Apache webserver are in different containers.

alberto56 commented 2 years ago

In a two-container fpm-apache setup, if in ./sites/default/files/.htaccess I comment out the security-related lines, the image styles are properly generated:

...
# Set the catch-all handler to prevent scripts from being executed.
# SetHandler Drupal_Security_Do_Not_Remove_See_SA_2006_006
<Files *>
  # Override the handler again if we're run later in the evaluation list.
  # SetHandler Drupal_Security_Do_Not_Remove_See_SA_2013_003
</Files>
...

Note that in my setup that file exists in both the Drupal container and the Apache container, so it is changed in both containers. (Actually, it is a volume shared by both containers.)
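
When comparing the two states, it can help to check whether a given .htaccess still has the handler lines active. This is a minimal sketch assuming the file follows the layout shown above; the function name has_security_handler is hypothetical, and the check only looks for uncommented SetHandler Drupal_Security_ lines.

```shell
#!/bin/sh
# Hypothetical helper: exit status 0 if the given .htaccess contains an
# uncommented "SetHandler Drupal_Security_..." line, non-zero otherwise.
# $1 = path to the .htaccess file to inspect
has_security_handler() {
  grep -Eq '^[[:space:]]*SetHandler[[:space:]]+Drupal_Security_' "$1"
}
```

It can be used in a condition, for example: if has_security_handler ./sites/default/files/.htaccess; then echo "handler lines active"; fi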
