getgrav / grav

Modern, Crazy Fast, Ridiculously Easy and Amazingly Powerful Flat-File CMS powered by PHP, Markdown, Twig, and Symfony
https://getgrav.org
MIT License
14.49k stars 1.4k forks source link

1.7.17 -> 1.7.19: Breaking change: Twig outputting html requires filter | raw #3091

Closed pamtbaau closed 3 years ago

pamtbaau commented 3 years ago

After upgrading to Grav 1.7.19 outputting html strings in Twig now requires filter | raw

Tested on: PHP 7.4.11 and PHP 8.0.0

Eg.

rhukster commented 3 years ago

Yes, this is something we've decided to enable by default due to a number of XSS issues that have been reported. We decided to enable auto-escaping by default before we release 1.7 as it's an important safety feature.

The upgrade guide will soon be updated with this.

BTW, you can easily revert to the old approach (no |raw needed) by enabling Twig Compatibility in system.yaml. Upgrading has some smart 'post-flight' logic to toggle this if you are upgrading from 1.6 where you probably already had this enabled.

We're trying to make as easy to upgrade as possible (without breaking), but ensuring that new sites are using the more secure approach.

pamtbaau commented 3 years ago

Does this mean I have to upgrade all my plugins and themes?

rhukster commented 3 years ago

Sorry see the previous comment, already added to it.

pamtbaau commented 3 years ago

Upgrading plugin/themes is not that much of a problem for me since I only use home-grown.

Maybe an idea to warn people on forehand in CHANGELOG.md and elsewhere before upgrading to 1.7.19?

rhukster commented 3 years ago

yes it should of been in the changelog.. not sure how it was missed.

pamtbaau commented 3 years ago

BTW, markdown page does not render html either.

E.g: # Title -> <h1>Title</h1>

mahagr commented 3 years ago

If you're upgrading your site, the update should automatically keep the old settings (or append them to system.yaml file). So the change only affects the new sites and even then you're still allowed to just use the old settings.

But yes, we have told about this change already a couple of years ago, but because of a badly written theme was used in an example to upload and run a script in the server, we just decided to put end to the bad practice of people never escaping anything.

You should only use |raw filter when you know that your input is safe HTML, such as markdown output. It is a lot easier though than to find every single variable output and escape all of them. Grav 2.0 will remove the possibility not to escape output (meaning that you need |raw), so it's better to start preparing now that there is still a lot of time to make the necessary changes.

mahagr commented 3 years ago

Here are Grav 1.6 backward compatible configuration options in case if they are needed:

twig:
  autoescape: false 
strict_mode:
  yaml_compat: true
  twig_compat: true
  blueprint_compat: true
w00fz commented 3 years ago

BTW, markdown page does not render html either.

E.g: # Title -> <h1>Title</h1>

Is this a particular theme doing it (antimatter?), a plugin’s template or what exactly is not rendering what you have there?

pamtbaau commented 3 years ago

It's not a theme build by the Grav team, but instead a home-grown theme that uses {{ page.content }} instead of {{ page.content | raw }}.

I wonder how many published themes/plugins output html and do not use filter | raw...

By the way, is there some plan to audit existing themes/plugins for any 1.7 upgrade issues? Any way I can help with that?

w00fz commented 3 years ago

We did a lot of in-house cleaning already but things I’m sure have been missed. I had thought about somehow scanning the themes and plugins in the GPM but the inconvenient truth is that there is no automated tool that will ever be able to predict what the intention of a variable is supposed to be: escaped or raw. However scanning for at least page.contentand ensuring the raw filter is present might not be a bad idea.

Any help in bringing awareness to this change and opening PRs on contributors projects (plugins/themes) is more than welcome.

We are in this situation and we need to move grav to autoescape by default for the better good. It won’t happen smoothly, unfortunately, but we have many safe guards in place, including preserving the old behavior if you are updating grav, as opposed to introducing the new behavior if you are starting fresh, and be able to switch to the old behavior if you just can’t deal with any of this for a project.

pamtbaau commented 3 years ago

including preserving the old behavior if you are updating grav

I'm in the habit of always replacing an existing Grav install with a fresh download instead of using bin/gpm selfupgrade. That's why I bumped into the issue.

I had thought about somehow scanning the themes and plugins in the GPM [..] scanning for at least page.content

That was exactly my idea... Maybe also adding {{ image.html|raw }}, {{ assets.css()|raw }}, {{ assets.js()|raw }}, ...

But also considering the quality/dependability on published themes/plugins in general. If a theme/plugin causes issues, it will reflect on the perceived stability/dependability/maturity of Grav itself.

mahagr commented 3 years ago

I'm in the habit of always replacing an existing Grav install with a fresh download instead of using bin/gpm selfupgrade. That's why I bumped into the issue.

This is why I introduced a new file user/config/versions.yaml, which will fill up any versions you're installed. Unfortunately as it is a new file, there is no way to know if you had a new install or upgrade until the file exists. This is why I also introduced the file in the latest 1.6 release. The idea of this file is to detect if the version has changed and to allow users to run post-installation scripts when needed. There will also be a manual way to run those scripts, but it didn't yet make into the release because we added this one pretty late of the release cycle.

pamtbaau commented 3 years ago

Ah, so that what the suddenly appearing version.yaml file is for... :-)

Btw, maybe an update of my last post on themes/plugins crossed your reply.

mahagr commented 3 years ago
* How could all plugins/themes be audited for code quality

I have already implemented tests that check the plugins against broken code and deprecated method calls. In addition, we have yaml linter, which can be used to check the validity of any yaml files with a more strict parser. Same can be done with twig to check if they use any deprecated or removed features -- but unfortunately those tests do not catch the raw issues in the templates.

* Could there be a procedure in place the forces theme/plugin publisher to update their code?

The easiest way is to check if there have been updates. If something hasn't been updated for a long time, the plugin/theme could have some warning about it being old. In the future, there could also be a way to mark compatibility from GPM itself. Also, I have in my mind to have automatic tests for the plugins and themes which would at least catch any fatal errors because of used removed features...

What comes to versions.yaml file, it should also contain installed themes and plugins. It could really be used to replace .dependencies file and allow automatic installation of any missing extensions (great if you have a repo that you cloned).

mahagr commented 3 years ago

Documentation has been updated to warn people about the change. Also I updated examples earlier to have both |e and |raw used in the variables as I didn't want to create 1.7 only examples.

pamtbaau commented 3 years ago

While you're at it... https://learn.getgrav.org/17/themes/theme-tutorial#step-4-base-template

{% extends 'partials/base.html.twig' %}

{% block content %}
    {{ page.content }}
{% endblock %}

Question: What data do you (the team) consider as safe?

mahagr commented 3 years ago

Fixed a few more examples like that.

I would put it in another way: everything that comes from something admin and users can edit/submit (pages, any configuration options, data) and anything that is passed by URL/GET/POST options in HTTP should be considered to be unsafe even if it contains HTML. This is because of there is always possibility that an attacker gets priviledges to modify content of the site. Thus even if those inputs accept HTML, they should be filtered when they are rendered. Content is markdown converted to HTML and as such considered to be already filtered and safe.

But even if we don't consider attacks, every input which is not explicitly specified to contain HTML (or JS/CSS depending on the context) should always be escaped. There cannot be any fields which can accept both or assume that people write HTML without explicitly mentioning that the field accepts only HTML.

I want to introduce |html filter, which can be used with user inputs that are HTML.

cnfolio commented 3 years ago

Have set the configuration options suggested by @mahagr for backwards compatibility, but, the page select form component is still showing drop down menu items with extra characters, e.g. each page in the drop down list is prefixed with the "—-▸" extra characters. As this is done within the core Grav code, I can't see a way for me to add the raw filter to it. Any suggestions on how to workaround or fix this issue? Thanks in advance.

cnfolio commented 3 years ago

github comments converted the html entity codes into the display characters in my comment above. In Grav, it is displaying it as the escaped html entity strings.

pamtbaau commented 3 years ago

@cnfolio, That is a issue from Admin plugin and marked as bug: Html entities are not rendered properly #1990

cnfolio commented 3 years ago

thanks, will try the manual file edit described by @renards

pamtbaau commented 3 years ago

@mahagr, Thanks for your elaborate answer.

Content is markdown converted to HTML and as such considered to be already filtered and safe.

Is it? Markdown containing <a href="javascript:alert('I am evil')">Trust me</a> is parsed and working in the browser...

I tend to consider any input coming from HTTP as unsafe (also Admin plugin), but do trust data stored in configs when Admin cannot edit the config (no blueprint). Do you agree?

E.g. I'm using a home grown Google Tag Manager plugin that inserts the gtm js script + id into the page. That snippet is stored in config of plugin. Should the js script instead be stored hardcoded in Twig?

I want to introduce |html filter, which can be used with user inputs that are HTML.

Is this meant to strip al js scripts from HTML?

mahagr commented 3 years ago

@cnfolio Yes, it was a separate issue which we forgot to fix in Grav 1.6 and Admin 1.9 after a security fix. It was working in Grav 1.7 and Admin 1.10, so we missed that one.

@pamtbaau Ugh, you're right. There is a warning in admin and that's all.

And yes, I agree that application data has to be considered as safe, otherwise we would argue that even PHP can be uploaded and we cannot trust it. But I agree, you can trust what you wrote, but you cannot trust any input that originally came from a browser.

That said, if possible, I would still avoid using HTML everywhere else than markdown or template files.