twigphp / Twig

Twig, the flexible, fast, and secure template language for PHP
https://twig.symfony.com/
BSD 3-Clause "New" or "Revised" License
8.18k stars 1.25k forks source link

Performance issues when escaping looooong value as `html_attr` #4322

Closed Kocal closed 2 months ago

Kocal commented 2 months ago

Hi everyone, sorry if this issue is a duplicate, but I didn't find anything here.

For reference: https://github.com/symfony/ux/pull/2178

With Symfony UX Map, when you create a Map with 1000 Marker and InfoWindow, the following code takes ~62% of the response time (near ~220 ms) to escape $string (here, a big JSON object which have been stringified): https://github.com/twigphp/Twig/blob/126b2c97818dbff0cdf3fbfc881aedb3d40aae72/src/Runtime/EscaperRuntime.php#L240-L303

I.. don't know if its more an issue from Twig or a skill-issue from our StimulusAttributes usage.

But do you think something is optimizable here? is it safe to replace 'a very long stringified json object'|e('html_attr') with htmlentities(json_encode('...')? Because this combo improved performances by 62%!

Thanks!

stof commented 2 months ago

inside quoted attribute values, it is safe to use the html strategy.

The html_attr strategy is specifically about the extra rules needed when escaping things in attribute names and unquoted values.

Kocal commented 2 months ago

Indeed, I've reverted my code and used html as escaping strategy, the performances are a lot better!

image

Thanks @stof!

javiereguiluz commented 2 months ago

@stof could you please explain a bit more what do you mean by "unquoted values"? Maybe show an example?

The Twig docs (https://twig.symfony.com/doc/3.x/filters/escape.html) say this:

html_attr: escapes a string for the HTML attribute context.

stof commented 2 months ago

@javiereguiluz HTML does not force you to put quotes around values of attributes. <input type=text> is valid HTML. When you don't have quotes around the value, it requires a lot more escaping (for instance, any whitespace has to be escaped so that it does not end the attribute value).

The attribute context is the context where you write those attributes.

Looking at the HTML parsing state, it has several state related to parsing attribute pairs: https://html.spec.whatwg.org/#before-attribute-name-state (sections 13.2.5.32 to 13.2.5.39) As you can see there, the Attribute value (double-quoted) state and Attribute value (single-quoted) state states have very few special characters (basically, only & and the corresponding quote), which are already escaped by the html strategy of Twig. Other attribute-related states (related to attribute names and unquoted attribute values) have much more special characters, which is what the html_attr escaping strategy is about

fabpot commented 2 months ago

We should maybe clarify the docs? Who wants to give it a try?

javiereguiluz commented 2 months ago

Thanks a lot for the details @stof 🙏

Kocal commented 2 months ago

Thanks Google, in fact that's a duplicate of #2615.

I will try to send a doc PR asap :)

EDIT: opened #4324