Allow unescaped backticks

rickj commented 13 years ago

rickj created Redmine issue ID 2805

I have a very specific question about the modParser class and backticks.

Backticks within a tag's property values must be escaped with another backtick. This applies when the backtick appears literally in a property value, like this:

// Example #1
[[$mychunk? &prop_name=`property `` value`]]

It also applies to backticks in any content pulled in by a nested tag within a property value:

// Example #2
[[$mychunk? &prop_name=`property [[$backtick_chunk]] value`]]
// Where backtick_chunk looks like this:
escaped backtick example: ``
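
To make the escaping rule concrete, here is a minimal sketch of a property-string scanner. It is a hypothetical, simplified stand-in and not MODX's actual parser: the function name parseProps() and its error handling are assumptions made for illustration. It treats a doubled backtick inside a value as one literal backtick and rejects a value that contains a lone backtick.

```php
<?php
// Hypothetical, simplified property-string parser; NOT MODX's actual code.
// Parses "&name=`value`" pairs. Inside a value, a doubled backtick stands
// for one literal backtick. Empty values are not supported in this sketch.
function parseProps(string $s): array
{
    $props = [];
    $len = strlen($s);
    $i = 0;
    while ($i < $len) {
        // Skip whitespace between properties.
        if (ctype_space($s[$i])) {
            $i++;
            continue;
        }
        if ($s[$i] !== '&') {
            throw new InvalidArgumentException("Expected '&' at offset $i");
        }
        $eq = strpos($s, '=`', $i);
        if ($eq === false) {
            throw new InvalidArgumentException("Missing value after offset $i");
        }
        $name = substr($s, $i + 1, $eq - $i - 1);
        $i = $eq + 2; // first character of the value
        $value = '';
        $closed = false;
        while ($i < $len) {
            if ($s[$i] === '`') {
                if ($i + 1 < $len && $s[$i + 1] === '`') {
                    $value .= '`'; // doubled backtick: one literal backtick
                    $i += 2;
                    continue;
                }
                $closed = true; // lone backtick: end of the value
                $i++;
                break;
            }
            $value .= $s[$i];
            $i++;
        }
        if (!$closed) {
            throw new InvalidArgumentException("Unterminated value for '$name'");
        }
        $props[$name] = $value;
    }
    return $props;
}
```

With this sketch, the property string from Example #1 yields a value containing a single backtick, while the same string with a lone backtick ends the value early and fails on the leftover text.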

I noticed that the parser expands the properties (evaluates all nested tags within a tag) before parsing out all the properties:

// In modParser::processTag()
$this->processElementTags($outerTag, $innerTag, true);
$outerTag= '[[' . $innerTag . ']]';

However, later the parser extracts the property values, then checks for and evaluates any nested tags again. If I remove the 2 lines above, I can have unescaped backticks in included content (the backticks within backtick_chunk of example 2 above wouldn't need to be escaped).

So here is my question: couldn't the parser skip the evaluation of nested tags within properties on the first pass?

I'm almost positive there is a good reason why nested tags are expanded before properties are parsed, but I just can't figure it out. Thanks for reading.

rickj commented 13 years ago

rickj submitted:

I installed the sample site, and although most things work without the processElementTags() call in modParser::processTag(), some things don't: comments don't display, and links for blog "tags" don't work.

Any idea why?

rickj commented 13 years ago

rickj submitted:

With further experimentation, I had success in allowing unescaped backticks in included content.

In modParser::processTag(), I changed the following lines:

$this->processElementTags($outerTag, $innerTag, true);
$outerTag= '[[' . $innerTag . ']]';
$tagParts= xPDO :: escSplit('?', $innerTag, '`', 2);
$tagName= trim($tagParts[0]);
$tagPropString= null;
if (isset ($tagParts[1])) {
    $tagPropString= trim($tagParts[1]);
}

to this:

$tagParts= xPDO :: escSplit('?', $innerTag, '`', 2);
$tagName= trim($tagParts[0]);
$tagPropString= null;
if (isset ($tagParts[1])) {
    $tagPropString= trim($tagParts[1]);
    // Get property name and value associative array
    $properties = $this->parsePropertyString($tagPropString, true);
    foreach ($properties as $propName => $propValue)
    {
        $origPropValue = $propValue;
        $this->processElementTags($origPropValue, $propValue, true);
        $properties[$propName] = $propValue;
    }
    // TODO: rename $tagPropString since it's actually an array now.
    $tagPropString = $properties;
}
$this->processElementTags($outerTag, $innerTag, true);
$outerTag= '[[' . $innerTag . ']]';

The key is to parse out the individual properties from the property string before expanding (parsing nested tags). Otherwise, included backticks make the property string unparseable because it's not possible to differentiate between a property name and value.
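
The effect of the ordering can be sketched outside MODX. Everything in this example is hypothetical and simplified for illustration: the $chunks array and the helpers expandChunks() and splitProps() are not part of the MODX API. It compares expanding nested tags before splitting the property string with splitting first and expanding each value afterwards.

```php
<?php
// Hypothetical chunk store; the chunk contains an UNESCAPED backtick.
$chunks = ['backtick_chunk' => 'unescaped backtick example: `'];

// Replace [[$name]] with the chunk's content (one level, no properties).
function expandChunks(string $text, array $chunks): string
{
    return preg_replace_callback(
        '/\[\[\$(\w+)\]\]/',
        function ($m) use ($chunks) {
            return $chunks[$m[1]] ?? '';
        },
        $text
    );
}

// Split "&name=`value`" pairs; `` inside a value is an escaped backtick.
function splitProps(string $propString): array
{
    preg_match_all('/&(\w+)=`((?:[^`]|``)*)`/', $propString, $matches, PREG_SET_ORDER);
    $props = [];
    foreach ($matches as $m) {
        $props[$m[1]] = str_replace('``', '`', $m[2]);
    }
    return $props;
}

$propString = '&prop_name=`property [[$backtick_chunk]] value`';

// Order A: expand nested tags first, then split. The chunk's backtick is
// mistaken for the closing delimiter, so the value is cut short.
$expandFirst = splitProps(expandChunks($propString, $chunks));

// Order B: split first, then expand each value. The delimiters are already
// resolved, so the chunk's backtick is just content.
$splitFirst = array_map(
    fn($value) => expandChunks($value, $chunks),
    splitProps($propString)
);
```

In order A the value is truncated at the chunk's backtick; in order B the full text, backtick included, survives as the property value.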

However, this fix isn't complete. The full tag is still expanded, potentially with unescaped backticks in its content, because the expanded tag is needed for things like a cache key. A complete fix would require a rethink or rewrite of the existing parser code.

My vision is to have a second flexible parser syntax where tag delimiters, element type names (tokens), property delimiters, and escape characters are configurable. A tag like this would be possible:

{chunk:myChunk propertyName="{chunk:linkText}"}

Essentially, I'd have a clone of the ExpressionEngine syntax, but without all the painful limitations and inconsistencies of a non-recursive parser.
