Shopify / liquid

Liquid markup language. Safe, customer facing template language for flexible web apps.
https://shopify.github.io/liquid/
MIT License

check template UTF8 validity before parsing #1774

Closed ggmichaelgo closed 10 months ago

ggmichaelgo commented 10 months ago

What are you trying to solve?

Liquid relies on regular expressions for parsing, so it needs to verify that the template string has a valid encoding before parsing begins.

Currently, Liquid raises an ArgumentError when a template includes an invalid UTF-8 byte sequence.

require 'liquid'
Liquid::Template.parse("{% assign foo = '\xC0' %}") 

# lib/liquid/tokenizer.rb:31:in `split': invalid byte sequence in UTF-8 (ArgumentError)

Instead of raising the ArgumentError, this PR updates Liquid to raise a syntax error when a template has an invalid encoding.
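For illustration only, an up-front check of this kind could look roughly like the sketch below. It does not reproduce this PR's diff: `TemplateEncodingError` and `check_template_encoding!` are hypothetical names invented for the example, and the only real API used is Ruby's `String#valid_encoding?`.

```ruby
# Hypothetical error class, standing in for whichever Liquid error the PR raises.
class TemplateEncodingError < StandardError; end

# Hypothetical helper: reject a template whose bytes are not valid in its
# declared encoding, before any regex-based tokenizing touches the string.
def check_template_encoding!(source)
  return source if source.valid_encoding?

  raise TemplateEncodingError,
        "invalid byte sequence in #{source.encoding.name}"
end

# A well-formed template passes through unchanged.
check_template_encoding!("{% assign foo = 'bar' %}")

# A template containing the stray byte \xC0 is rejected up front.
begin
  check_template_encoding!("{% assign foo = '\xC0' %}")
rescue TemplateEncodingError => e
  puts "rejected: #{e.message}"
end
```

Checking once before tokenizing means the failure surfaces as a single, library-specific error rather than as an ArgumentError from whichever regex call happens to touch the bad bytes first.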

With this change, developers no longer have to catch the invalid encoding error by matching on the ArgumentError message, like this:

begin
  Liquid::Template.parse("\xC0")
rescue ArgumentError => e
  # Distinguish the encoding failure from any other ArgumentError by its message.
  if e.message == "invalid byte sequence in UTF-8"
    ...
  else
    raise e
  end
end

Instead, the error can be handled like this:

begin
  Liquid::Template.parse("\xC0")
rescue Liquid::Error
  ...
end