zspecza / common-tags

🔖 Useful template literal tags for dealing with strings in ES2015+
Other
1.99k stars 60 forks source link

transform literals #106

Closed pizzafroide closed 6 years ago

pizzafroide commented 7 years ago

Rationale

It may sometimes be interesting to transform literals instead of substitutions. Take this snippet for example:

const result = `
  <head>
    <script>
      ${code}
    </script>
  </head>`

If I want to "minify" this code as with oneLineTrim but the code variable contains an already obfuscated and really long string, I'd like to transform the literals and let code as it is instead of transforming the whole resulting string.

I could have written:

const result = `${oneLineTrim`
  <head>
    <script>
      `}${code}${oneLineTrim`
    </script>
  </head>`}`

But this is not really readable. Instead, with these new functionalities, I can write a simple transformer like this:

const minify = new TemplateTag(
  replaceLiteralTransformer(/(?:\n\s+)/g, '')
)

And the result is much more readable:

const result = minify`
  <head>
    <script>
      ${code}
    </script>
  </head>`

Note: If code is 'foo\nbar', I obtain '<head><script>foo\nbar</script></head>'.

An idea for a future version

Using the new onLiteral transformer property, it could be possible creating "custom syntax" for placeholders. For example:

html`${foo} ${safeHtml`${bar}`}`

Could be rewritten as:

html(newSyntax)`${foo} !${bar}`

And the code of the newSyntax template tag would be something like:

const newSyntax = new TemplateTag({
  onLiteral(literal) {
    const match = /(^|.*[^\\])!$/g.match(literal)
    if (match)
      return { then: safeHtml, value: match[1] }
    return literal;
  }
})

Note: The new !${expression} syntax can be escaped using \.

If foo is '<span>foo</span>' and bar is '<span>bar</span>', the output is '<span>foo</span> &lt;span&gt;bar&lt;/span&gt;'.

But the following (quite dumb) code outputs '<span>foo</span> !<span>bar</span>'.

html(newSyntax)`${foo} \!${bar}`

This needs some changes in TemplateTag though: when onLiteral returns an object, substitutions need to be chained to already existing ones for transforming the next substitution and the returned value property should be passed to the next transformer.

codecov-io commented 7 years ago

Codecov Report

Merging #106 into master will not change coverage. The diff coverage is 100%.

Impacted file tree graph

@@          Coverage Diff          @@
##           master   #106   +/-   ##
=====================================
  Coverage     100%   100%           
=====================================
  Files          23     24    +1     
  Lines         153    160    +7     
=====================================
+ Hits          153    160    +7
Impacted Files Coverage Δ
src/TemplateTag/TemplateTag.js 100% <100%> (ø) :arrow_up:
...placeStringTransformer/replaceStringTransformer.js 100% <100%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update c6b0925...446229e. Read the comment docs.

zspecza commented 6 years ago

I really like this proposal:

html(newSyntax)`${foo} !${bar}`

However, I think the following has become a non-issue thanks to #132:

This needs some changes in TemplateTag though: when onLiteral returns an object, substitutions need to be chained to already existing ones for transforming the next substitution and the returned value property should be passed to the next transformer.

Any of the tags common-tags exports may now operate on regular strings thanks to the above mentioned PR. So, we can assumedly support this style:

const newSyntax = new TemplateTag({
  onLiteral(literal) {
    const match = /(^|.*[^\\])!$/g.match(literal)
    if (match) {
      return safeHtml(match[1]);
    }
    return literal;
  }
})

It really excites me how the different features of common-tags are being combined in this proposal to add something of real value here.

Thoughts, @fatfisz, @pizzafroide?

pizzafroide commented 6 years ago

So, we can assumedly support this style:

const newSyntax = new TemplateTag({
  onLiteral(literal) {
    const match = /(^|.*[^\\])!$/g.match(literal)
    if (match) {
      return safeHtml(match[1])
    }
    return literal
  }
})

Well, that's not exactly true. First, #132 only operates on the end result; in other words #132 works for oneLine but not for safeHtml (I'm not entirely sure if this should be considered a bug or not). So, instead of safeHtml(match[1]), it should be safeHtml`${match[1]}`. Secondly, match[1] is the literal, not the substitution that follows the literal, on which safeHtml is expected to be applied.

However, it is possible populating a "transformers queue" in onLiteral and consume it in onSubstitution. The queue can be stored in a (symbol) property of the current TemplateTag instance (aka this). This works because literals are transformed before substitutions. So, the final code would be something like this:

const TRANSFORMERS_QUEUE = Symbol('transformersQueue')

const newSyntax = new TemplateTag({
  onLiteral(literal) {
    this[TRANSFORMERS_QUEUE] = this[TRANSFORMERS_QUEUE] || []
    const match = /(^|.*[^\\])!$/g.exec(literal)
    if (match) {
      this[TRANSFORMERS_QUEUE].push(safeHtml)
      return match[1]
    }
    this[TRANSFORMERS_QUEUE].push(null)
    // The following check was missing in the previous example
    if (literal.endsWith('\\!')) {
      return literal.substr(0, literal.length - 2) + '!'
    }
    return literal
  },
  onSubstitution(substitution) {
    const currentTransformer = this[TRANSFORMERS_QUEUE].shift()
    if (currentTransformer) {
      return currentTransformer`${substitution}`
    }
    return substitution
  }
})

Which gives:

const foo = '<span>foo</span>'
const bar = '<span>bar</span>'
const baz = '<span>baz</span>'

html(newSyntax)`${foo} !${bar} ${baz}`
// => "<span>foo</span> &lt;span&gt;bar&lt;/span&gt; <span>baz</span>"

Note: The escape sequence is actually '\\!' instead of '\!' as mentioned earlier. So:

html(newSyntax)`${foo} \\!${bar} ${baz}`
// => "<span>foo</span> !<span>bar</span> <span>baz</span>"

A new built-in transformer

Since the code of such a transformer is a little complicated, maybe it would be interesting to provide a new built-in transformer such as the following:

const escapeStringRegexp = input => input.replace(/[|\\{}()[\]^$+*?.]/g, '\\$&')

const TRANSFORMERS_QUEUE = Symbol('transformersQueue')

const literalTransformer = (sequence, transformer) => {
  if (sequence.length === 0) {
    throw new Error('Literal transformer sequence should not be empty.')
  }
  return {
    onLiteral(literal) {
      this[TRANSFORMERS_QUEUE] = this[TRANSFORMERS_QUEUE] || []
      const re = new RegExp(`(^|.*[^\\\\])${escapeStringRegexp(sequence)}$`, 'g')
      const match = re.exec(literal)
      if (match) {
        this[TRANSFORMERS_QUEUE].push(transformer)
        return match[1]
      }
      this[TRANSFORMERS_QUEUE].push(null)
      if (literal.endsWith(`\\${sequence}`)) {
        return literal.substr(0, literal.length - sequence.length - 1) + sequence
      }
      return literal
    },
    onSubstitution(substitution, resultSoFar) {
      const currentTransformer = this[TRANSFORMERS_QUEUE].shift()
      if (currentTransformer) {
        return currentTransformer`${substitution}`
      }
      return substitution
    }
  }
}

The user can now "generate" its own literal transformer easily:

const newSyntax = new TemplateTag(literalTransformer('!', safeHtml))

Or even combine more than one literal transformer:

const yesNo = new TemplateTag({
  onSubstitution(substitution) {
    return substitution ? 'yes' : 'no'
  }
})

const newSyntax= new TemplateTag(
  literalTransformer('!', safeHtml),
  literalTransformer('?', yesNo)
)

html(newSyntax)`?${true} !${foo} ${bar}`
// => "yes &lt;span&gt;foo&lt;/span&gt; <span>bar</span>"

Note: Instead of defining escapeStringRegexp, it may be better to use escape-string-regexp (which, by the way, could be added as a useful transformer).

Yet another proposal

literalTransformer only checks that its sequence argument is not empty. Maybe more checks are needed... Anyway, my point is: it is possible using sequences longer than one character. But, when doing this, the order in which transformers are passed to TemplateTag has an importance if some sequence is a sub-string of another one, as shown by the following example:

const s1 = new TemplateTag(
  literalTransformer('$$', safeHtml),
  literalTransformer('$', yesNo)
)
html(s1)`$${true} $$${foo} ${bar}`
// => OK: "yes &lt;span&gt;foo&lt;/span&gt; <span>bar</span>"

const s2 = new TemplateTag(
  literalTransformer('$', yesNo),
  literalTransformer('$$', safeHtml)
)
html(s2)`$${true} $$${foo} ${bar}`
// => KO: "yes $yes <span>bar</span>"

That's why I think that common-tags could also provide a function to automatically reorder literal transformers:

const createSyntax = syntax => Object.keys(syntax)
  .sort((a, b) => a.length < b.length)
  .map(sequence => literalTransformer(sequence, syntax[sequence]))

Which gives:

const newSyntax = new TemplateTag(createSyntax({
  '$': yesNo,
  '$$': safeHtml
}))
html(newSyntax)`$${true} $$${foo} ${bar}`
// => OK: "yes &lt;span&gt;foo&lt;/span&gt; <span>bar</span>"
zspecza commented 6 years ago

These are great ideas, but I'm afraid they veer too far away from simplicity. One of the goals for 2018 is to remove the concept of transformers entirely - everything becomes a template tag, and all template tags are infinitely composable. I've outlined a roadmap for version 2.0 of common-tags in which I've borrowed your idea but simplified it by changing the method signature of transformer methods to include an object as the second parameter which contains information about the entire template literal.

The roadmap is not public yet however as I'm still thinking about a few things (like whether to have the composition in createTag or add a new composeTags function).

pizzafroide commented 6 years ago

These are great ideas, I'm afraid they veer too far away from simplicity.

You're right: I've read the roadmap for version 2.0 and I think too it'll be way easier implementing those kind of things in the new "everything's a TemplateTag" pattern.

Concerning the roadmap:

  1. I have a tiny suggestion about the "transformer context parameter" in sprint 5:

    Each transformer method (onSubstitution, onString, onEndResult) will receive a new parameter which defaults to an empty object.

    Wouldn't it be better to pass an object with a context property instead of an empty object? It would fit the chunk parameter specifications from sprint 9, and avoid compatibility issues.

    import { createTag } from 'common-tags';
    
    const listSubs = createTag({
      onSubstitution(sub, result, { context }) {
        context.subs = (context.subs || []).concat({ sub, precededBy: result });
        return sub;
      },
      onEndResult(res, { context }) {
        return context;
      }
    });
  2. Also I'd have some questions about sprint 9:

    Built-in transformers will be converted to template tags, and the less used template tags or template tags that only had one transformer will be removed or combined into generic abstractions where necessary.

    Does this mean that tags such as commaLists*, for example, will be removed in favor of a new "tag generator" replacing inlineArrayTransformer(opts)? Something like:

    inlineArray({ 
      separator: ',', 
      conjunction: 'or'
    })`${['apples', 'bananas', 'watermelons']}`
    // => "apples, bananas or watermelons"

    And would some tags also be added? I think about escapeRegexp for example; or escapeHtml (different from safeHtml in that it would operate on the end result rather than the substitutions).

zspecza commented 6 years ago

Wouldn't it be better to pass an object with a context property instead of an empty object? It would fit the chunk parameter specifications from sprint 9, and avoid compatibility issues.

I suppose it would not hurt.

Does this mean that tags such as commaLists*, for example, will be removed in favor of a new "tag generator" replacing inlineArrayTransformer(opts)? Something like:

Precisely. I'm aiming to borrow some naming convention and generalization hints from parser combinator libraries like Parsec/Parsimmon. So the <x>List template tags will be replaced by new ones. sepBy would be the main one for that purpose, which adds a separator and then users would use common-tags' composition utilities for edge cases like serial commas and conjunctions.

And would some tags also be added? I think about escapeRegexp for example; or escapeHtml (different from safeHtml in that it would operate on the end result rather than the substitutions).

Yes, I have a few tags I'm planning to add to the core library. That's what common-tags is all about :)

Back on the topic of this PR - do you think you could rename the method (or add an alias) to onString? (onLiteral is a little confusing as all the methods effectively operate on "literals")

Then we can get this merged in :ship:

pizzafroide commented 6 years ago

Sure, here you go! I opt for renaming as onString. They call it "string", "text" or "string text" on MDN; it could also have been onText, or onText could have been an alias. Though I find an alias for such a feature would introduce unnecessary complexity... So, onString it is.

zspecza commented 6 years ago

Thanks, @pizzafroide!