simonw / datasette

An open source multi-tool for exploring and publishing data
https://datasette.io
Apache License 2.0

jinja2_environment_from_request plugin hook #2225

Closed simonw closed 9 months ago

simonw commented 9 months ago

For Datasette Cloud I want the ability to have one Datasette instance serve different templates depending on the host of the incoming request - so I can have a private simon.datasette.cloud instance running default Datasette, but I can let users have a public simon.datasette.site instance which uses their own custom templates.

I don't want people running custom templates on *.datasette.cloud for security reasons: an XSS hole in a custom template must not be able to steal cookies or perform actions on behalf of signed-in users.

I tried implementing this at first with a monkeypatch, but ran into problems. Instead, I'm going to do a research spike to see whether a plugin hook that lets plugins influence the Jinja environment based on the incoming request is a cleaner mechanism for this.

simonw commented 9 months ago

Potential design:

def jinja2_environment_from_request(request, datasette, env):

Where env is the existing environment, and it's strongly suggested that this return an overlay on it. The plugin documentation could describe overlays in detail with an example.

That name is consistent with actor_from_request(datasette, request) and prepare_jinja2_environment(env, datasette).

Overlays look like this:

jinja_env = self.jinja_env.overlay(
    loader=ChoiceLoader([FileSystemLoader("/tmp/custom-templates"), self.jinja_env.loader])
)
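
To make the overlay behaviour concrete, here is a minimal self-contained sketch that runs outside Datasette. The template names and contents are made up for illustration, and DictLoader stands in for the FileSystemLoader used above so that no files are needed. It shows that the overlay tries the custom loader first, falls back to the original loader, and leaves the base environment untouched.

```python
from jinja2 import ChoiceLoader, DictLoader, Environment

# Stand-in for Datasette's default environment (hypothetical templates).
base_env = Environment(
    loader=DictLoader(
        {
            "base.html": "default base",
            "index.html": "default index",
        }
    )
)

# Hypothetical per-host templates: only base.html is overridden.
custom_loader = DictLoader({"base.html": "custom base"})

# The overlay tries custom_loader first, then the original loader.
overlay_env = base_env.overlay(
    loader=ChoiceLoader([custom_loader, base_env.loader])
)

custom_base = overlay_env.get_template("base.html").render()
fallback_index = overlay_env.get_template("index.html").render()
original_base = base_env.get_template("base.html").render()
```

Because overlay() returns a new environment object, the base environment keeps serving the default templates for every other host.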
simonw commented 9 months ago

One of the key things this will enable is a single Datasette instance that returns different sites (using different templates) on different hosts - someone with a side-project hobby like mine could run a single instance and host simonwillison.net and www.niche-museums.com from the same Datasette, with custom templates that vary based on the host.

simonw commented 9 months ago

A catch with that plan: https://github.com/simonw/datasette/blob/45b88f2056e0a4da204b50f5e17ba953fcb51865/datasette/app.py#L1567-L1580

That code implements the feature that lets custom templates define new pages. Note that it doesn't have access to the request object at that point, which means it can't behave differently for different hosts.

I can fix that by making it dynamic, so it runs per request instead of statically once when the class is constructed. I hope that won't be too much of a performance hit.

I could maybe mitigate that performance hit by having template loaders that cache their own list_templates() calls.
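
A rough sketch of what such a caching loader could look like, assuming a simple wrapper around any existing Jinja loader. CachedListLoader and CountingLoader are hypothetical names for illustration, not part of Datasette or Jinja. The cache here is never invalidated, so templates added after the first call would not be seen; a real implementation would need to decide when to refresh.

```python
from jinja2 import BaseLoader, DictLoader, Environment


class CachedListLoader(BaseLoader):
    """Hypothetical wrapper that remembers the wrapped loader's list_templates()."""

    def __init__(self, wrapped):
        self.wrapped = wrapped
        self._names = None

    def get_source(self, environment, template):
        # Template loading itself is delegated unchanged.
        return self.wrapped.get_source(environment, template)

    def list_templates(self):
        # Only ask the wrapped loader the first time.
        if self._names is None:
            self._names = self.wrapped.list_templates()
        return self._names


class CountingLoader(DictLoader):
    """Demo-only loader that counts how often list_templates() is called."""

    calls = 0

    def list_templates(self):
        self.calls += 1
        return super().list_templates()


inner = CountingLoader({"pages/about.html": "About"})
loader = CachedListLoader(inner)

first = loader.list_templates()
second = loader.list_templates()
rendered = Environment(loader=loader).get_template("pages/about.html").render()
```

After the two list_templates() calls, the wrapped loader has been asked only once.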

simonw commented 9 months ago

https://github.com/simonw/datasette/commit/b43d5c374a222c7dcf4a74c0b90390be858df33f is the first version that passes the existing Datasette tests. It still needs tests and documentation, and I should actually use it to make sure it solves the problem.

simonw commented 9 months ago

I'm going to make this into a documented internal method too: https://github.com/simonw/datasette/blob/b43d5c374a222c7dcf4a74c0b90390be858df33f/datasette/app.py#L439-L446

simonw commented 9 months ago

Here's the previous attempt at a plugin for this which used monkeypatching and didn't quite work:

from datasette.app import Datasette
from datasette.views.base import BaseView
from datasette.utils.asgi import Response
from datasette.utils import path_with_format
from jinja2 import ChoiceLoader, FileSystemLoader, Template
from typing import Any, Dict, List, Optional, Union
from datasette import Request, Context
import os

def is_custom_public_site(request):
    return (
        request
        and os.environ.get("DATASETTE_CUSTOM_TEMPLATE_DIR")
        and (
            request.host.endswith(".datasette.site")
            or request.host.endswith(".datasette.net")
        )
    )

def get_env(datasette, request):
    # if request and request.host != 'localhost' and not request.host.endswith('.cloud'):
    #     breakpoint()
    if is_custom_public_site(request):
        print("get_env", request, os.environ["DATASETTE_CUSTOM_TEMPLATE_DIR"])
        return datasette.jinja_env.overlay(
            loader=ChoiceLoader(
                [
                    FileSystemLoader(os.environ["DATASETTE_CUSTOM_TEMPLATE_DIR"]),
                    datasette.jinja_env.loader,
                ]
            ),
            enable_async=True,
        )
    return datasette.jinja_env

async def new_render_template(
    self,
    templates: Union[List[str], str, Template],
    context: Optional[Union[Dict[str, Any], Context]] = None,
    request: Optional[Request] = None,
    view_name: Optional[str] = None,
):
    if isinstance(templates, str):
        templates = [templates]

    # if request and request.host != 'localhost' and not request.host.endswith('.cloud'):
    #     breakpoint()

    # If all templates are strings
    if (
        request
        and isinstance(templates, list)
        and all(isinstance(t, str) for t in templates)
    ):
        # Check if request's host matches .datasette.site
        if is_custom_public_site(request):
            jinja_env = get_env(self, request)
            templates = [jinja_env.select_template(templates)]

    return await original_render_template(
        self=self,
        templates=templates,
        context=context,
        request=request,
        view_name=view_name,
    )

original_render_template = Datasette.render_template
Datasette.render_template = new_render_template

# Total copy-paste replacement of the original BaseView.render method:
async def new_base_render(self, templates, request, context=None):
    context = context or {}
    jinja_env = get_env(self.ds, request)
    template = jinja_env.select_template(templates)

    # if request and request.host != 'localhost' and not request.host.endswith('.cloud'):
    #     breakpoint()

    template_context = {
        **context,
        **{
            "select_templates": [
                f"{'*' if template_name == template.name else ''}{template_name}"
                for template_name in templates
            ],
        },
    }
    headers = {}
    if self.has_json_alternate:
        alternate_url_json = self.ds.absolute_url(
            request,
            self.ds.urls.path(path_with_format(request=request, format="json")),
        )
        template_context["alternate_url_json"] = alternate_url_json
        headers.update(
            {
                "Link": '{}; rel="alternate"; type="application/json+datasette"'.format(
                    alternate_url_json
                )
            }
        )
    return Response.html(
        await self.ds.render_template(
            template,
            template_context,
            request=request,
            view_name=self.name,
        ),
        headers=headers,
    )

BaseView.render = new_base_render

simonw commented 9 months ago

Got it working! Here's my plugin, plugins/custom_templates_for_host.py:

from datasette import hookimpl
from jinja2 import ChoiceLoader, FileSystemLoader
import os

def is_custom_public_site(request):
    return (
        request
        and os.environ.get("DATASETTE_CUSTOM_TEMPLATE_DIR")
        and (
            request.host.endswith(".datasette.site")
            or request.host.endswith(".datasette.net")
        )
    )

@hookimpl
def jinja2_environment_from_request(request, env):
    if is_custom_public_site(request):
        print("get_env", request, os.environ["DATASETTE_CUSTOM_TEMPLATE_DIR"])
        return env.overlay(
            loader=ChoiceLoader(
                [
                    FileSystemLoader(os.environ["DATASETTE_CUSTOM_TEMPLATE_DIR"]),
                    env.loader,
                ]
            ),
            enable_async=True,
        )
    return env

And the tests, tests/test_custom_templates_for_host.py:

import pathlib
import pytest

@pytest.mark.asyncio
@pytest.mark.parametrize("path", ("/", "/test/dogs", "/test"))
@pytest.mark.parametrize(
    "host,expect_special",
    (
        (None, False),
        ("foo.datasette.cloud", False),
        ("foo.datasette.site", True),
        ("foo.datasette.net", True),
    ),
)
async def test_custom_templates_for_host(
    datasette, path, host, expect_special, tmpdir, monkeypatch
):
    templates = pathlib.Path(tmpdir / "custom-templates")
    templates.mkdir()
    (templates / "base.html").write_text(
        '{% extends "default:base.html" %}{% block footer %}Custom footer!{% endblock %}',
        encoding="utf-8",
    )
    monkeypatch.setenv("DATASETTE_CUSTOM_TEMPLATE_DIR", str(templates))
    headers = {}
    if host:
        headers["host"] = host
    response = await datasette.client.get(path, headers=headers)
    assert response.status_code == 200
    if expect_special:
        assert "Custom footer!" in response.text
    else:
        assert "Custom footer!" not in response.text
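
The host check can also be exercised on its own, without Datasette. This is a sketch only: it assumes a stand-in request object with just a host attribute (fake_request is a hypothetical helper), and it restates the predicate with a bool() wrapper and a tuple passed to str.endswith, which is a slight variation on the plugin code above.

```python
import os
from types import SimpleNamespace


def is_custom_public_site(request):
    # Variation on the plugin's predicate: same conditions, explicit bool result.
    return bool(
        request
        and os.environ.get("DATASETTE_CUSTOM_TEMPLATE_DIR")
        and request.host.endswith((".datasette.site", ".datasette.net"))
    )


def fake_request(host):
    # Hypothetical stand-in for a Datasette request; only .host is used.
    return SimpleNamespace(host=host)


os.environ["DATASETTE_CUSTOM_TEMPLATE_DIR"] = "/tmp/custom-templates"
results = {
    host: is_custom_public_site(fake_request(host))
    for host in ("localhost", "foo.datasette.cloud", "foo.datasette.site", "foo.datasette.net")
}
no_request = is_custom_public_site(None)

# With the environment variable unset, no host gets custom templates.
del os.environ["DATASETTE_CUSTOM_TEMPLATE_DIR"]
env_unset = is_custom_public_site(fake_request("foo.datasette.site"))
```

Only the .datasette.site and .datasette.net hosts should come back True, and only while DATASETTE_CUSTOM_TEMPLATE_DIR is set.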
simonw commented 9 months ago

Documentation preview: https://datasette--2227.org.readthedocs.build/en/2227/plugin_hooks.html#jinja2-environment-from-request-datasette-request-env