yeoman / grunt-usemin

[UNMAINTAINED] Replaces references to non-optimized scripts or stylesheets into a set of HTML files (or any templates/views)
BSD 2-Clause "Simplified" License
1.22k stars 339 forks source link

Data attributes containing angular "hidden" fields (e.g. data-smth="obj.$$field") are mistakenly stripped of the second $ #480

Open ptanov opened 10 years ago

ptanov commented 10 years ago

If in an html file processed by usemin there is something like this:

<input data-ng-model="obj.$$field" .... />

then the result will be:

<input data-ng-model="obj.$field" .... />

(without the double $) The attribute is matched by this pattern:


var _defaultPatterns = {
  html: [...
    [
      /data-(?!main).[^=]+=['"]([^'"]+)['"]/gm,
      'Update the HTML with data-* tags'
    ]

then in https://github.com/yeoman/grunt-usemin/blob/1b5fcefc979d6c904e2957721a94c39971115633/lib/fileprocessor.js#L207

      var res = match.replace(src, filterOut(file));

match is data-ng-model="obj.$$field" src is obj.$$field filterOut(file) is obj.$$field and res is obj.$field

This is because the symbol $ is used for groups, e.g. $1, $2, etc. ( see https://developer.mozilla.org/en/docs/Web/JavaScript/Guide/Regular_Expressions#special-capturing-parentheses ). I can see that the symbol $ is escaped with a second $ (but I can't find where this is actually noted in the specification).

In order to use usemin task I have copied the default html patterns in my gruntfile and then I have explicitly set filterOut (rxl[3]):


function(m) {
          return m.replace(/\$/g, '$$$$');
      }

instead of using default case:

    var filterOut = rxl[3] || identity;
...
  var identity = function (m) {
    return m;
  };

The whole grunt configuration:


            usemin: {
                html: ["..."],
                options: {
                    assetsDirs: ["..."],
                    patterns: {
                        //from node_modules/grunt-usemin/lib/fileprocessor.js

                        "html": [
                            [
                                /<script.+src=['"]([^"']+)["']/gm,
                                'Update the HTML to reference our concat/min/revved script files'
                            ],
                            [
                                /<link[^\>]+href=['"]([^"']+)["']/gm,
                                'Update the HTML with the new css filenames'
                            ],
                            [
                                /<img[^\>]*[^\>\S]+src=['"]([^"']+)["']/gm,
                                'Update the HTML with the new img filenames'
                            ],
                            [
                                /<video[^\>]+src=['"]([^"']+)["']/gm,
                                'Update the HTML with the new video filenames'
                            ],
                            [
                                /<video[^\>]+poster=['"]([^"']+)["']/gm,
                                'Update the HTML with the new poster filenames'
                            ],
                            [
                                /<source[^\>]+src=['"]([^"']+)["']/gm,
                                'Update the HTML with the new source filenames'
                            ],
                            [
                                /data-main\s*=['"]([^"']+)['"]/gm,
                                'Update the HTML with data-main tags',
                                function (m) {
                                    return m.match(/\.js$/) ? m : m + '.js';
                                },
                                function (m) {
                                    return m.replace('.js', '');
                                }
                            ],
                            [
                                /data-(?!main).[^=]+=['"]([^'"]+)['"]/gm,
                                'Update the HTML with data-* tags',
                                null,
                                function(m) {
                                    return m.replace(/\$/g, '$$$$');
                                }
                            ],
                            [
                                /url\(\s*['"]?([^"'\)]+)["']?\s*\)/gm,
                                'Update the HTML with background imgs, case there is some inline style'
                            ],
                            [
                                /<a[^\>]+href=['"]([^"']+)["']/gm,
                                'Update the HTML with anchors images'
                            ],
                            [
                                /<input[^\>]+src=['"]([^"']+)["']/gm,
                                'Update the HTML with reference in input'
                            ],
                            [
                                /<meta[^\>]+content=['"]([^"']+)["']/gm,
                                'Update the HTML with the new img filenames in meta tags'
                            ],
                            [
                                /<object[^\>]+data=['"]([^"']+)["']/gm,
                                'Update the HTML with the new object filenames'
                            ],
                            [
                                /<image[^\>]*[^\>\S]+xlink:href=['"]([^"']+)["']/gm,
                                'Update the HTML with the new image filenames for svg xlink:href links'
                            ],
                            [
                                /<image[^\>]*[^\>\S]+src=['"]([^"']+)["']/gm,
                                'Update the HTML with the new image filenames for src links'
                            ],
                            [
                                /<(?:img|source)[^\>]*[^\>\S]+srcset=['"]([^"'\s]+)(?:\s\d[mx])["']/gm,
                                'Update the HTML with the new image filenames for srcset links'
                            ]
                        ]

                    }
                }
            },

This function should be added in all patterns. I hope you understand what I'm trying to explain here.

Thanks

stephanebachelier commented 9 years ago

@ptanov not sure if we need to added it to fileprocessor as it is a non standard attribute.

But you can still provide you regexp to pattern options for usemin. I think your regexp is not correct:

/data-(?!main)[^=]+=['"]([^'"]+)['"]/gm should be better than : /data-(?!main).[^=]+=['"]([^'"]+)['"]/gm see the dot between (?!main) and [^=] blocks, as it does not catch attributes data-x where x is a single character.

By the way I would use the following regexp, as your are using angular: /data-ng-[^=]+=['"]([^'"]+)['"]/gm.

This gives the following configuration:

usemin: {
  html: 'build/index.html',
  options: {
    patterns: {
      html: [
        [
           /data-ng-[^=]+=['"]([^'"]+)['"]/gm,
           'Update the HTML with data-* tags',
           null,
           function(m) {
              return m.replace(/\$/g, '$$$$');
           }
         ]
      ]
    }
  }
}
ptanov commented 9 years ago

Hello @stephanebachelier, My point is not about how to match a given attribute (the sample code for matching was taken from the grunt-usemin source), but that when an attribute is matched and the attribute value contains a double dollar sign ("$$"), it is replaced with a single dollar sign (which results in an unintended change of the attribute value).

Thank you!

stephanebachelier commented 9 years ago

@ptanov I see. IMO the fileprocessor regexps are standard while any non standard regexp or specific filter should be given by user.

@sindresorhus what do you think about it ?

ptanov commented 9 years ago

@stephanebachelier, I think that we are talking about different things. When I have the attribute data-ng-model="" it is matched by usemin (which is expected and is NOT the problem). The problem is that if the attribute value contains the string "$$" it will be replaced with the string "$" by usemin which is unexpected and undesirable. Is this what you are responding to?

Thank you.

stephanebachelier commented 9 years ago

@ptanov thanks for your explanation. Now I understand your problem. Indeed it's a bug.

Not sure if fileprocessor is involved or if it may be linked to another issue about usemin not correctly handling special characters.

I would prefer not to touch to fileprocessor to add a new regexp, but needs to think about it.

ptanov commented 9 years ago

OK, thanks One solution I can think of is changing these lines in the file: https://github.com/yeoman/grunt-usemin/blob/6119f90bde1c64eddfe1d2717ad98dead9f73768/lib/fileprocessor.js#L187

  var identity = function (m) {
    return m;
  };

  // Replace reference to script with the actual name of the revved script
  regexps.forEach(function (rxl) {
    var filterIn = rxl[2] || identity;
    var filterOut = rxl[3] || identity;

to

  var identity = function (m) {
    return m;
  };
  var identityRegexReplacement = function (m) {
    return m.replace(/\$/g, '$$$$');
  };

  // Replace reference to script with the actual name of the revved script
  regexps.forEach(function (rxl) {
    var filterIn = rxl[2] || identity;
    var filterOut = rxl[3] || identityRegexReplacement;

I think this would not affect anything apart from the problem reported.

Thanks

stephanebachelier commented 9 years ago

@ptanov thanks for coming back, I will test this change. There is another issue about escaping characters so I will work on both at the same time,

stephanebachelier commented 9 years ago

Linking this issue to html parser migration as it might solve this. see #244.

stephanebachelier commented 9 years ago

@ptanov just to let you know that I'll test this problem and others issues related to the block replacement in the dev branch.