Shopify / liquid-c

Liquid performance extension in C.
MIT License

Fix Liquid::C::Tokenizer compatibility for liquid tag #117

Closed dylanahsmith closed 4 years ago

dylanahsmith commented 4 years ago

cc @wizardlyhel who pointed out the problem to me

Problem

Liquid::Tokenizer and Liquid::C::Tokenizer differed in how they tokenized the contents of a liquid tag. Liquid::C::Tokenizer produced a single token per tag, with the newlines included in the token. Liquid::Tokenizer, however, just used @source.split("\n"), which omitted the newlines and produced a single token per line.
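To make the difference concrete, here is a minimal plain-Ruby sketch of the two token shapes. It does not call either tokenizer: `String#split` mirrors what the description says Liquid::Tokenizer did, and `String#lines` is only an assumed stand-in for tokens that keep their newline, not the actual output of Liquid::C::Tokenizer.

```ruby
# Body of a liquid tag (illustrative input, with one blank line).
source = "assign x = 1\n\nassign y = x | plus: 2\necho y\n"

# One token per line, newlines dropped, blank lines kept as "".
line_tokens = source.split("\n")
# => ["assign x = 1", "", "assign y = x | plus: 2", "echo y"]

# Stand-in for tokens that still carry their newline (assumption,
# used only to show the shape of the mismatch).
newline_tokens = source.lines
# => ["assign x = 1\n", "\n", "assign y = x | plus: 2\n", "echo y\n"]

puts line_tokens.inspect
puts newline_tokens.inspect
```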

The newline character prevented the Liquid::BlockBody::LiquidTagToken regex from matching, because . does not match a newline unless the regex is multiline. This resulted in syntax errors when using the disable_liquid_c_nodes: true or profile: true parse option.

For example, parsing the following liquid tag with disable_liquid_c_nodes: true

      {%- liquid
        assign x = 1
        assign y = x | plus: 2
        echo y
      -%}

would result in the following syntax error:

Liquid::SyntaxError: Liquid syntax error: Unknown tag 'assign x = 1
  '
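The regex behaviour behind this error can be reproduced in plain Ruby. The pattern below is an assumption shaped like a per-line tag matcher, not a copy of Liquid::BlockBody::LiquidTagToken, and the input string is taken from the error message above.

```ruby
# Assumed per-line tag pattern (hypothetical, for illustration only).
LINE_TAG = /\A\s*(\w+)\s*(.*?)\z/
# Same pattern with the multiline flag, so that . also matches "\n".
LINE_TAG_MULTILINE = /\A\s*(\w+)\s*(.*?)\z/m

with_newline = "assign x = 1\n  "
without_newline = "assign x = 1"

# Without /m, . stops at the newline, so \z is never reached.
no_match = LINE_TAG.match(with_newline)                    # => nil

# A token without the newline matches as expected.
tag_name = LINE_TAG.match(without_newline)[1]              # => "assign"
markup = LINE_TAG.match(without_newline)[2]                # => "x = 1"

# With /m the token matches, but the markup keeps the newline.
multiline_markup = LINE_TAG_MULTILINE.match(with_newline)[2] # => "x = 1\n  "

puts no_match.inspect, tag_name, markup, multiline_markup.inspect
```

When a pattern like this fails to match, the whole token, newline included, ends up treated as the tag name, which is consistent with the "Unknown tag" message shown above.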

Solution

Make Liquid::C::Tokenizer compatible with Liquid::Tokenizer by preserving blank lines, but emit them with a new token type so that the C parsing code can easily ignore them.
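The idea can be sketched in Ruby, although the actual change lives in the C extension. Everything here is hypothetical: the `Token` struct, the `:blank_line` and `:tag` type names, and the `tokenize_liquid_tag_body` helper are illustrative and do not correspond to identifiers in liquid-c.

```ruby
# Hypothetical token with a type, so consumers can filter by kind.
Token = Struct.new(:type, :text)

# Split a liquid tag body into one token per line, keeping blank lines
# but marking them with their own type.
def tokenize_liquid_tag_body(source)
  source.split("\n").map do |line|
    type = line.strip.empty? ? :blank_line : :tag
    Token.new(type, line)
  end
end

tokens = tokenize_liquid_tag_body("assign x = 1\n\necho x\n")

# A Ruby-side consumer sees the same text, line for line, as split("\n").
texts = tokens.map(&:text)
# => ["assign x = 1", "", "echo x"]

# A consumer that only wants tags skips blank lines by type,
# without inspecting the text.
tags = tokens.reject { |t| t.type == :blank_line }.map(&:text)
# => ["assign x = 1", "echo x"]

puts texts.inspect, tags.inspect
```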