TYPO3 / Fluid

Fluid template rendering engine - Standalone version
GNU Lesser General Public License v3.0
152 stars 92 forks source link

Throw exception for unresolved namespaces #8

Closed bwaidelich closed 8 years ago

bwaidelich commented 9 years ago

Unknown namespaces used to be ignored in Fluid. But that lead to serious issues, e.g. when extracting template code into a partial while forgetting the namespace imports. S.th. like

<media:image image="{image}" />

Would try to render '{image}directly if themedia:image` ViewHelper was ignored, leading to various errors that are very hard to debug.

With Flow 3.0 we therefore introduced a stricter handling of unknown namespaces, see: https://jira.neos.io/browse/FLOW-150

<unknownNamespace:foo />

Expected output

Error while rendering a ViewHelper
The namespace of ViewHelper notation "<unknownNamespace:foo.../>" could not be resolved.

Possible reasons are:
* you have a spelling error in the viewHelper namespace
* you forgot to import the namespace using "{namespace unknownNamespace=Some\Package\ViewHelpers}"
* you're trying to use a non-fluid xml namespace, in which case you can use "{namespace unknownNamespace}" to ignore this namespace for fluid rendering

Actual output

<unknownNamespace:foo />
NamelessCoder commented 9 years ago

Confirmed: they are currently ignored and output as template content. I believe we can throw an exception about this in ViewHelperResolver, e.g. in each adapter package that requires the extra strictness - but I am not convinced it should be in the base package (although I acknowledge the potential problems in outputting {object} as string if <badns:vh object="{object}" />).

There is a pendant from this circumstance to handling of non-ViewHelper namespaces declared on xmlns attributes: currently, those are not recorded - they would have to be in order to silently ignore such namespaces. This in turn has a pendant to the TemplateParserPatterns which, if I recall it correctly, only recognises namespaces if they use the predefined ^http://typo3.org/ns/.ViewHelpers$ format (forgive missing escapes). It's fairly easy to introduce as a template pre-processor and added methods on adapters' ViewHelperResolver classes to "add ignored namespace".

bwaidelich commented 9 years ago

If you look at the description (and possibly implementation) of https://jira.neos.io/browse/FLOW-150 you can see that there is more to it than just detecting unknown namespaces. i.e. we allow to ignore certain namespace patterns explicitly using

{namespace foo*}

I doubt that this can be re-implemented in the adapter package via ViewHelperResolver!?

NamelessCoder commented 9 years ago

It sticks deeper still. According to the issue, the Exception behavior also depends on Production/Development context of which Fluid knows nothing and imho should know nothing. This is a clear indication that yes, it should be in the adapter package.

{namespace foo*} I doubt that this can be re-implemented in the adapter package via ViewHelperResolver!?

It absolutely can. A TemplatePreProcessor would catch the namespace and it would have access to the ViewHelperResolver of the type used in the adapter package. This class can then have any number of additional and overridden methods accepting any amount of custom namespace registration syntax desired.

A similar TemplatePreProcessor is implemented in TYPO3's adapter package to allow legacy {namespace foo=Path\To\ViewHelpers} registration. It extracts this and adds it to the ViewHelperResolver which then "knows" about the namespace.

bwaidelich commented 9 years ago

It sticks deeper still. According to the issue, the Exception behavior also depends on Production/Development context [...]

That's a misunderstanding, the behavior of the rendering is different. There's nothing about the Application context in the actual Fluid change (https://review.typo3.org/30673).

It absolutely can. A TemplatePreProcessor would catch the namespace and it would have access to the ViewHelperResolver of the type used in the adapter package

OK great, someone need to look into that then - I'll try if I find the time to do so

NamelessCoder commented 9 years ago

OK great, someone need to look into that then - I'll try if I find the time to do so

Let me know if you need more background or some rough draft code. I think we should also summon @mneuhaus here as he is the author of the original patch for Fluid and may have some valuable input.

NamelessCoder commented 9 years ago

That's a misunderstanding, the behavior of the rendering is different. There's nothing about the Application context in the actual Fluid change (https://review.typo3.org/30673).

Ah, right - my bad. It's all caught and handled/not-handled "from the outside", not by switching exception types.

mneuhaus commented 9 years ago

mneuhaus has been summoned and will take a peek tonight!

mneuhaus commented 9 years ago

In general i think we should provide this feature in fluid itself, because it's really quite a common pitfall that's not really obvious to newcomers. We could use it as a default Processor, which you can easily unset/override/ignore i.

i guess it could look something like this:

<?php
namespace TYPO3Fluid\Fluid\Core\Parser\TemplateProcessor;

use TYPO3Fluid\Fluid\Core\Parser\Patterns;
use TYPO3Fluid\Fluid\Core\Parser\TemplateParser;
use TYPO3Fluid\Fluid\Core\Parser\TemplateProcessorInterface;
use TYPO3Fluid\Fluid\Core\ViewHelper\ViewHelperResolver;

/*
 * This file belongs to the package "TYPO3 Fluid".
 * See LICENSE.txt that was shipped with this package.
 */

/**
 *
 */
class UnknownNamespaceDetectionTemplateProcessor implements TemplateProcessorInterface {

    /**
     * @var TemplateParser
     */
    protected $templateParser;

    /**
     * @var ViewHelperResolver
     */
    protected $viewHelperResolver;

    /**
     * @var string
     */
    protected $namespacePattern = '/(?<!\\\\){namespace\s*(?P<identifier>[a-zA-Z\*]+[a-zA-Z0-9\.\*]*)\s*(=\s*(?P<phpNamespace>(?:[A-Za-z0-9\.]+|Tx)(?:\\\\\w+)+)\s*)?}/m';

    /**
     * @var array()
     */
    protected $localNamespaces = array();

    /**
     * Setter for passing the TemplateParser instance
     * that is currently processing the template.
     *
     * @param TemplateParser $templateParser
     * @return void
     */
    public function setTemplateParser(TemplateParser $templateParser) {
        $this->templateParser = $templateParser;
    }

    /**
     * Setter for passing the ViewHelperResolver instance
     * being used by the TemplateParser to resolve classes
     * and namespaces of ViewHelpers.
     *
     * @param ViewHelperResolver $viewHelperResolver
     * @return void
     */
    public function setViewHelperResolver(ViewHelperResolver $viewHelperResolver) {
        $this->viewHelperResolver = $viewHelperResolver;
    }

    /**
     * Pre-process the template source before it is
     * returned to the TemplateParser or passed to
     * the next TemplateProcessorInterface instance.
     *
     * @param string $templateSource
     * @return string
     */
    public function preProcessSource($templateSource) {
        $this->parseLocalNamespaces($templateSource);

        preg_match_all(Patterns::$SPLIT_PATTERN_TEMPLATE_DYNAMICTAGS, $templateSource, $splitParts);
        foreach ($splitParts[0] as $viewHelper) {
            preg_match_all(Patterns::$SCAN_PATTERN_TEMPLATE_VIEWHELPERTAG, $viewHelper, $matches);
            foreach ($matches['NamespaceIdentifier'] as $key => $namespace) {
                if (!$this->isValidNamespace($namespace)) {
                    throw new \Exception('Unkown Namespace: ' . htmlspecialchars($matches[0][$key]));
                }
            }
        }

        preg_match_all(Patterns::$SPLIT_PATTERN_SHORTHANDSYNTAX_VIEWHELPER, $templateSource, $shorthandViewHelpers);
        foreach ($shorthandViewHelpers['NamespaceIdentifier'] as $key => $namespace) {
            if (!$this->isValidNamespace($namespace)) {
                throw new \Exception('Unkown Namespace: ' . $shorthandViewHelpers[0][$key]);
            }
        }
        return $templateSource;
    }

    public function isValidNamespace($namespace) {
        $registerdNamespaces = $this->viewHelperResolver->getNamespaces();
        if (isset($registerdNamespaces[$namespace])) {
            return TRUE;
        }
        foreach ($this->localNamespaces as $localNamespace) {
            $pattern = '/' . str_replace(array('.', '*'), array('\\.', '[a-zA-Z0-9\.]*'), $localNamespace) . '/';
            if (preg_match($pattern, $namespace) === 1) {
                return TRUE;
            }
        }
        return FALSE;
    }

    public function parseLocalNamespaces($templateSource) {
        preg_match_all($this->namespacePattern, $templateSource, $localNamespaces);
        foreach ($localNamespaces['identifier'] as $localNamespaces) {
            $this->localNamespaces[] = $localNamespaces;
        }
    }

    public function addIgnoredNamespace($namespace) {
        $this->ignoredNamespace[] = $namespace;
    }
}

and used like this:

<?php

    $processor = new \TYPO3Fluid\Fluid\Core\Parser\TemplateProcessor\UnknownNamespaceDetectionTemplateProcessor();
    $view->setTemplateProcessors(array(
        $processor
    ));
NamelessCoder commented 9 years ago

@mneuhaus This looks spot on to me - it is exactly what I had in mind for this.

I think all this needs is a few checks ($splitParts[0] must be array, etc) and the missing doc comments.

bwaidelich commented 8 years ago

Unfortunately this is not completely fixed with #17 (see issue comment)

NamelessCoder commented 8 years ago

@mneuhaus You did lift the expression detecting these, directly from Flow - right? Maybe the CDATA protection happened elsewhere and now also needs to happen here.

mneuhaus commented 8 years ago

@NamelessCoder yep, used it without modification. i'll take peek tonight when/where that cdata got filtered out there.

bwaidelich commented 8 years ago

@mneuhaus could you have another look at this at some point? #27 did not do the trick unfortunately

mneuhaus commented 8 years ago

@bwaidelich hey, yes i'll take another peak tonight :)

mneuhaus commented 8 years ago

so, the issue turned out to be, that the cdata pattern was to specific, it started with ^ and ended with $.

i added a new Pattern STRIP_PATTERN_CDATA that doesn't have that limitation :)

bwaidelich commented 8 years ago

@mneuhaus great, thanks for taking care. But are you sure that this works reliably, e.g. with multiple and/or nested CDATA sections?

mneuhaus commented 8 years ago

@bwaidelich i just added another testcase with 2 cdata blocks and a namespace registration in before, between and after, seems to work as expected. The regex inside the cdata-catcher is ungreedy, so it should be fine, altough you can't of course guarantee anything with regexes ;D

https://github.com/TYPO3Fluid/Fluid/commit/13149ae6befd10cb14cc0374757fdf5adbd0af94#diff-1dd205376bc362161e9427170328b8c1R665

do you have any other idea/suggestion for additional testcases i should include?

mneuhaus commented 8 years ago

doh, you're right about the nested ones :/

this fails:

<![CDATA[
    {nonExistingNamespace:beforeNested()}
    <![CDATA[{nonExistingNamespace:nested()}]]>
    {nonExistingNamespace:afterNested()}
]]>
mneuhaus commented 8 years ago

phew, this is a doozy :(

i've been struggling to get a regex working properly for that case through a combination of lookahead/lookbehind, recursive patterns, etc. all fail in some usecases or just don't work, because the "enclosure" isn't a single character like ( .. ) but \<\!\[CDATA\[ .. \]\]\>

i could to a simple preg_split with balancing like this (START ... END instead of cdata for readability) to do the filtering, but this will fail as soon as there is an imbalance inside the cdata (one to many open or close)

<?php 

$text = '
{namespace before}
START(
    {incdata:beforeNested()}
    START{inCdata:nested()}END
    {incdata:afterNested()}
END
{namespace outside}
START{nonExistingNamespace:baz()}END
{namespace outsidebetween2cdata}
START{nonExistingNamespace:baz()}END
{namespace outsideafter}
';

$parts = preg_split('/(START|END)/', $text, -1, PREG_SPLIT_DELIM_CAPTURE);
$balance = 0;
foreach ($parts as $index => $part) {
    if ($part === 'START') {
        $balance++;
    }
    if ($balance > 0) {
        unset($parts[$index]);
    }
    if ($part === 'END') {
        $balance--;
    }
}

echo implode('', $parts);

All in all this is really extremely messy to deal with :/ Further more it also goes against the way CDATA should be used https://en.wikipedia.org/wiki/CDATA#Nesting

I'm quite unsure about how to do this, currently i'd go for the simple balancing version and "ignore" the fact that not opened/closed nested cdata would wreck havoc. The regex version would suffer similiar problems as wenn i think

bwaidelich commented 8 years ago

mh, regular expressions at its best ;) In the TYPO3.Fluid package the parser splits the template on CDATA sections (https://github.com/neos/flow-development-collection/blob/master/TYPO3.Fluid/Classes/TYPO3/Fluid/Core/Parser/TemplateParser.php#L46-L50) and then again when processing the splitted parts (https://github.com/neos/flow-development-collection/blob/master/TYPO3.Fluid/Classes/TYPO3/Fluid/Core/Parser/TemplateParser.php#L531) Maybe that's an option?

mneuhaus commented 8 years ago

Hey @bwaidelich :)

yes we could reproduce that same logic, but it would to the same as my balancing foreach. i just took a bit of a deeper dive into the TemplateParser to find out how exactly it currently handles namespace declarations and viewhelpers inside cdata an especially inside nested cdata.

tl;dr;

So, reproducing the old functionality, would basicall reindtroduce "buggy" behavior regarding nested cdata tags, i'd vote in favor of the "balancing" foreach loop to handle the cdata/namespace stuff, because it does work as expected as long as your cdata are valid and not imbalanced opened/closed.

Details:

i tested by using the latest base-distribution and replace the default Welcome Index with :

simple cdata
{namespace format=TYPO3\Fluid\ViewHelpers\Format}
-------------------------------------------------------------------------------
|   <![CDATA[
|       {namespace a=Foo\A\ViewHelpers}
|       {namespace format=TYPO3\Fluid\ViewHelpers\Format}
|   ]]>
<format:raw>foobar</format:raw>
-------------------------------------------------------------------------------
<![CDATA[ reset ]]><![CDATA[ reset ]]>

proper balanced nested cdata
-------------------------------------------------------------------------------
|   <![CDATA[
|       {namespace b=Foo\B\ViewHelpers}
|       <![CDATA[
|       {namespace c=Foo\C\ViewHelpers}
|       ]]>
|       {namespace d=Foo\D\ViewHelpers}
|   ]]>
-------------------------------------------------------------------------------
<![CDATA[ reset ]]><![CDATA[ reset ]]>

unclosed inner cdata
-------------------------------------------------------------------------------
|   <![CDATA[
|       {namespace e=Foo\E\ViewHelpers}
|       <![CDATA[
|       {namespace g=Foo\G\ViewHelpers}
|
|       {namespace h=Foo\H\ViewHelpers}
|   ]]>
-------------------------------------------------------------------------------
<![CDATA[ reset ]]><![CDATA[ reset ]]>

missing inner open cdata tag
-------------------------------------------------------------------------------
|   <![CDATA[
|       {namespace i=Foo\I\ViewHelpers}
|
|       {namespace j=Foo\J\ViewHelpers}
|       ]]>
|       {namespace k=Foo\K\ViewHelpers}
|   ]]>
-------------------------------------------------------------------------------
<![CDATA[ reset ]]><![CDATA[ reset ]]>

raw viewHelper inside cdata
-------------------------------------------------------------------------------
|   <![CDATA[
|       before nested cdata: <f:format.raw>asiodaoijsd</f:format.raw>
|       <![CDATA[
|           inside nested cdata<f:format.raw>asiodaoijsd</f:format.raw>
|       ]]>
|       after nested cdata<f:format.raw>asiodaoijsd</f:format.raw>
|   ]]>
-------------------------------------------------------------------------------
<![CDATA[ reset ]]><![CDATA[ reset ]]>

The resulting output is:

simple cdata

-------------------------------------------------------------------------------
|   
|       
|       
|   
foobar
-------------------------------------------------------------------------------
 reset  reset 

proper balanced nested cdata
-------------------------------------------------------------------------------
|   
|       
|       <![CDATA[
|       
|       
|       
|   ]]>
-------------------------------------------------------------------------------
 reset  reset 

unclosed inner cdata
-------------------------------------------------------------------------------
|   
|       
|       <![CDATA[
|       
|
|       
|   
-------------------------------------------------------------------------------
 reset  reset 

missing inner open cdata tag
-------------------------------------------------------------------------------
|   
|       
|
|       
|       
|       
|   ]]>
-------------------------------------------------------------------------------
 reset  reset 

raw viewHelper inside cdata
-------------------------------------------------------------------------------
|   
|       before nested cdata: <f:format.raw>asiodaoijsd</f:format.raw>
|       <![CDATA[
|           inside nested cdata<f:format.raw>asiodaoijsd</f:format.raw>
|       
|       after nested cdataasiodaoijsd
|   ]]>
-------------------------------------------------------------------------------
 reset  reset 

additionally i did a print_r of the registered namespaces inside the TemplateParser after evaluating the template, which resulted in these registered namespaces:

Namespaces: Array
(
    [f] => TYPO3\Fluid\ViewHelpers
    // ... default package namespaces ...
    [format] => TYPO3\Fluid\ViewHelpers\Format
    [a] => Foo\A\ViewHelpers
    [b] => Foo\B\ViewHelpers
    [c] => Foo\C\ViewHelpers
    [d] => Foo\D\ViewHelpers
    [e] => Foo\E\ViewHelpers
    [g] => Foo\G\ViewHelpers
    [h] => Foo\H\ViewHelpers
    [i] => Foo\I\ViewHelpers
    [j] => Foo\J\ViewHelpers
    [k] => Foo\K\ViewHelpers
)
bwaidelich commented 8 years ago

Wow, thanks for looking into that one and for the detailed report! I think there was a misunderstanding though: IMO it is not necessary to support CDATA sections in the "metadata"-section (that is {namespace ..} declarations) within a template. Do you think CDATA nesting is not properly supported and can you provide a snippet that demonstrates this?

bwaidelich commented 8 years ago

OK, after talking to Marc again here is where TYPO3.Fluid is inconsistent in regards to CDATA sections: Given:

<![CDATA[
    one
    <![CDATA[
        two
    ]]>
    three
]]>

Expected output:

Actual Output:

    one
    <![CDATA[
        two

    three
]]>

For me the most important thing is that the behavior stays compatible from TYPO3.Fluid to the standalone version (compatible as in similar or more correct).

bwaidelich commented 8 years ago

FYI: Little update on this issue: So according to Marc the current CDATA behavior is broken in TYPO3.Fluid already and I can confirm that nested CDATA sections don't work as expected. But IMO that can be fixed in a separate change sometimes in the future.

But the critical difference to the current behavior is that

<![CDATA[
    {non.existing.namespace:bar()}
]]>

now throws an UnknownNamespaceException!

BTW: It took me a while but I finally fixed and updated my testing server, so you can see the error in action: http://134.119.11.221/flowpack.viewhelpertest/standard/index?selectedPartial=_parser_

bwaidelich commented 8 years ago

Hi,

some more feedback on this one: Can someone quickly test whether the code above still throws an exception? Because IMO that would be a nogo. CDATA sections are meant to be completely ignored by the parser and changing that would break a lot of templates out there.

PS: Unfortunately I had to shut down my test instance, but hopefully we manage to set one up again and adjust my fork to the latest version of Flow & TYPO3Fluid to see how many tests of the Viewhelpertest package still fail.

mneuhaus commented 8 years ago

I just test this, the following code does not throw an exception in the current version.

<![CDATA[
    {non.existing.namespace:bar()}
]]>
kitsunet commented 8 years ago

Yep I can confirm this. Although I was also not really happy with the fact htat now CDATA sections disappear entirely from the rendered result (as they are basically replaced with empty string in \TYPO3Fluid\Fluid\Core\Parser\TemplateProcessor\NamespaceDetectionTemplateProcessor

I changed that in my adaptor package with a little trick that works quite well ;)

I base64 encode the entire (balanced) CDATA section and wrap that base64 string with a base64decode VH. That completely masks the CDATA section from the Parser and everything, but retains the CDATA section in the final output.

NamelessCoder commented 8 years ago

@kitsunet It seems the skipped stripping of CDATA was in fact buggy behavior in the original. It's buried in this thread but the couple of comments following https://github.com/TYPO3Fluid/Fluid/issues/8#issuecomment-148704015 should explain what was done and why :)

I do have an alternative suggestion how you might be able to improve the CDATA protection you do:

You may want to nest these in an array in the variable provider. The core of the matter is that this way likely performs better than base64. I assume even if this "expect CDATA to be output" behavior gets deprecated, the support has to remain in place for a long time, so performance should probably be considered.

Caveat: beware of https://forge.typo3.org/issues/76276 which is still in effect.

EDIT: Other caveat, may not work correctly once the template is compiled. If caches are enabled for the adapter or implementation then a method which somehow preserves the CDATA as part of the template itself would be preferred, like the above with base64.

Whether or not this issue can be closed (use case in issue is solved?) I'll let you guys decide.

mneuhaus commented 8 years ago

from my end this could be closed i guess.

kitsunet commented 8 years ago

Btw @NamelessCoder the idea with the variable provider is nice but I need a way to have the cdata content encoded in the compiled template otherwise I don't see how that woudl work on consecutive renders of this template....