python-babel / flask-babel

i18n and l10n support for Flask based on Babel and pytz
https://python-babel.github.io/flask-babel/

LazyString's `__html__` is an incomplete implementation of MarkupSafe #224

Open jace opened 1 year ago

MarkupSafe's Markup class, as used in Flask, is very careful about mixing escaped and unescaped strings. For instance, Markup(...).format(...) will ensure all format variables are escaped before being interpolated into the string, and the resulting string is fully escaped.

LazyString does not do this, and the presence of a __html__ method (as previously raised in #121) creates a situation where there is no way to tell whether a string is properly escaped or not:

>>> from flask_babel import lazy_gettext
>>> from markupsafe import Markup
>>> l = lazy_gettext("This is a <em>string</em> with a {var}")
>>> l
l'This is a <em>string</em> with a {var}'

>>> Markup(l)
Markup('This is a <em>string</em> with a {var}')

>>> Markup(l.format(var="variable & more"))
Markup('This is a <em>string</em> with a variable & more')

>>> Markup(l).format(var="variable & more")
Markup('This is a <em>string</em> with a variable &amp; more')
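
The difference between the last two results comes down to which `format` runs: `str.format` interpolates values verbatim, while `Markup.format` escapes each value first unless the value provides `__html__`. The following is a simplified, standard-library sketch of that idea, not MarkupSafe's actual code. `EscapingFormatter` and `format_html` are hypothetical names used only for illustration.

```python
import html
import string


class EscapingFormatter(string.Formatter):
    """Hypothetical formatter that escapes every interpolated value."""

    def format_field(self, value, format_spec):
        # Objects that implement __html__ declare themselves already safe,
        # so their markup is passed through unchanged.
        if hasattr(value, "__html__"):
            return value.__html__()
        # Everything else is formatted normally and then HTML-escaped.
        return html.escape(super().format_field(value, format_spec))


def format_html(template, *args, **kwargs):
    """Interpolate escaped values into a template that is trusted as HTML."""
    return EscapingFormatter().format(template, *args, **kwargs)


template = "This is a <em>string</em> with a {var}"

plain = template.format(var="variable & more")
escaped = format_html(template, var="variable & more")

print(plain)    # This is a <em>string</em> with a variable & more
print(escaped)  # This is a <em>string</em> with a variable &amp; more
```

Only the second call keeps the result valid as HTML, which matches the `Markup(l).format(...)` output above.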

When a lazy string has format variables, it must be wrapped in Markup() before calling .format() so that it continues to behave as an HTML string. However, this is dangerous to do in a function that receives the string as a parameter, since the function cannot know whether the string is actually safe. Markup wrapping must happen at the source, but that is not possible in a lazy context either, because wrapping forces the string to be evaluated.
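
To illustrate why wrapping inside a receiving function is risky, here is a small sketch using MarkupSafe's `Markup` and `escape`. The helper names `render_wrapping` and `render_escaping` are hypothetical. `Markup(...)` marks whatever it is given as safe, whereas `escape(...)` escapes plain strings but trusts any object that defines `__html__`, so a string type that always defines `__html__` cannot be told apart from genuinely safe markup.

```python
from markupsafe import Markup, escape


def render_wrapping(template, **kwargs):
    # Hypothetical helper: blindly declares the received template safe.
    return Markup(template).format(**kwargs)


def render_escaping(template, **kwargs):
    # Hypothetical helper: escapes plain str, but passes through anything
    # that implements __html__ (such as Markup) unchanged.
    return escape(template).format(**kwargs)


trusted = Markup("This is a <em>string</em> with a {var}")
untrusted = "<script>alert(1)</script> {var}"

# The script tag survives because the function assumed the input was safe.
wrapped = render_wrapping(untrusted, var="variable & more")

# The script tag is neutralised, and trusted markup is left intact.
escaped_untrusted = render_escaping(untrusted, var="variable & more")
escaped_trusted = render_escaping(trusted, var="variable & more")

print(wrapped)
print(escaped_untrusted)
print(escaped_trusted)
```

The second helper is only as reliable as the `__html__` claim of the object it receives, which is the crux of this issue.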

Here is a test case showing how gettext, lazy_gettext and Markup all behave differently. As a result, neither translator nor programmer has any indication of whether a given string is plain text or HTML, and every string needs a full integration test to confirm that markup and escaping are handled appropriately across translations.

Possible mitigations:

from flask import Flask
from markupsafe import Markup
from flask_babel import Babel, gettext, lazy_gettext

import pytest

@pytest.fixture(scope='session')
def app():
    return Flask(__name__)

@pytest.fixture(scope='session')
def babel(app):
    return Babel(app)

@pytest.fixture()
def ctx(app, babel):
    with app.test_request_context() as context:
        yield context

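# The same template is used for all three variants below.
raw_string = "This is a <em>string</em> with a {var}"

# Each factory is a lambda so that the text is only created inside the
# request context provided by the ctx fixture.
get_texts = [
    pytest.param(lambda: gettext(raw_string), id='str'),
    pytest.param(lambda: lazy_gettext(raw_string), id='lazy'),
    pytest.param(lambda: Markup(raw_string), id='markup'),
]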

@pytest.mark.usefixtures('ctx')
@pytest.mark.parametrize('get_text', get_texts)
def test_gettext_type(get_text):
    text = get_text().format(var="variable & more")
    assert isinstance(text, str)

@pytest.mark.usefixtures('ctx')
@pytest.mark.parametrize('get_text', get_texts)
def test_gettext_value(get_text):
    text = get_text().format(var="variable & more")
    assert text == "This is a <em>string</em> with a variable &amp; more"

@pytest.mark.usefixtures('ctx')
@pytest.mark.parametrize('get_text', get_texts)
def test_gettext_html(get_text):
    text = get_text().format(var="variable & more")
    assert '__html__' in text

Output (with errors interpolated):

FAILED lazystr_test.py::test_gettext_value[str] - AssertionError: assert equals failed
E         'This is a <em>string</em> with a variable & more'      'This is a <em>string</em> with a variable &amp; more'
FAILED lazystr_test.py::test_gettext_value[lazy] - AssertionError: assert equals failed
E         'This is a <em>string</em> with a variable & more'      'This is a <em>string</em> with a variable &amp; more'
FAILED lazystr_test.py::test_gettext_html[str] - AssertionError: assert '__html__' in 'This is a <em>string</em> with a variable & more'
E       AssertionError: assert '__html__' in 'This is a <em>string</em> with a variable & more'
FAILED lazystr_test.py::test_gettext_html[lazy] - AssertionError: assert '__html__' in 'This is a <em>string</em> with a variable & more'
E       AssertionError: assert '__html__' in 'This is a <em>string</em> with a variable & more'
FAILED lazystr_test.py::test_gettext_html[markup] - AssertionError: assert '__html__' in Markup('This is a <em>string</em> with a variable &amp; more')
E       AssertionError: assert '__html__' in Markup('This is a <em>string</em> with a variable &amp; more')
=== 5 failed, 4 passed ===
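
Note that `'__html__' in text` is a substring test on the string's contents, which is why `test_gettext_html` fails for all three cases, including `Markup`. If the intent was to check whether the result advertises itself as HTML, an attribute check is probably what was meant. A minimal sketch, assuming that intent; `HtmlString` is a hypothetical stand-in for `Markup` so the example runs with the standard library alone, and `is_html` is likewise an illustrative name.

```python
class HtmlString(str):
    """Hypothetical stand-in for markupsafe.Markup: a str with __html__."""

    def __html__(self):
        return self


def is_html(value):
    """Return True if the object declares itself HTML-safe via __html__."""
    return hasattr(value, "__html__")


plain = "This is a <em>string</em> with a variable & more"
marked = HtmlString("This is a <em>string</em> with a variable &amp; more")

# Substring test: looks for the literal text "__html__" inside the string.
substring_check = "__html__" in marked

# Attribute test: looks for the __html__ method on the object.
attribute_check = is_html(marked)
plain_check = is_html(plain)

print(substring_check, attribute_check, plain_check)  # False True False
```

With an attribute check, the `markup` case would be expected to pass, while the `str` and `lazy` cases would still fail because `.format()` returns a plain `str` for both.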