Closed mhulse closed 8 years ago
@kangax, to clarify this issue, does that mean that there should be the option for people to specify their own start/end tags for the ignored pattern match?
One situation I can think of might be opening/closing tags for something like Django templates:
{% block foo %}html stuff goes here{% endblock foo %}
{# this is a single line server-side comment #}
{{ variable }}
... where the pattern to match/ignore would be based on {%
, {#
or {{
(for opening tags) and %}
, #}
or }}
(for closing tags)?
In other words, should the developer be allowed to pass in a regex/pattern to the code for ignoring tags of their choice? If so, should this custom regex/pattern override the existing "ignore tags" regex or should it get appended-to (i.e. on top of matching <?
, <%
, ?>
, and %>
, it would also match these custom opening/closing tags: {%
, {#
, {{
, %}
, #}
and }}
)?
In other words, should the developer be allowed to pass in a regex/pattern to the code for ignoring tags of their choice?
Yes, I think that would make sense.
If so, should this custom regex/pattern override the existing "ignore tags" regex or should it get appended-to
I think most definitely replaced. If I'm working with ruby, I don't care to do anything about "<?", "?>" strings. I would just want to say something like:
ignoreStartTag: /<\%+/,
ignoreEndTag: /\%>/
and only get those stripped.
The default ones we introduced are just a convenience for 2 most popular languages, and/or until we introduce custom syntax registering.
Does this sound reasonable?
Does this sound reasonable?
Definitely! :smile:
I don't think it would be hard to introduce as a feature either; it should just be as simple as adding two new options and then using those options to search for start/end tags instead of the hard-coded regex variables.
Like you say, the defaults for the options would be regex pattern that searches for <?
, <%
and ?>
, %>
).
If you're open to the help, I could probably tackle this feature this weekend. :octocat:
This is not urgent by any means, I think, so definitely take your time. Thanks for all the help so far!
Sounds good! Thanks Juriy! :+1:
This feature would be very helpful in my latest project. :+1:
I think I could do the PR this weekend. Fortunately, it should be pretty easy/fast to do. :octocat:
@Luis-TP Based on the tags you need to ignore, could you provide an open/close tag example regex for me to test with?
I think I have a working patch, I just need write a few more tests at this point.
@kangax, do you have any preference in terms of the implementation of registration of custom syntax regexes?
Here's a few ideas ... First, the simplest approach:
ignoreStartTag: typeof options.ignoreStartTag !== 'undefined' ? options.ignoreStartTag : /<(%|\?)/,
ignoreEndTag: typeof options.ignoreEndTag !== 'undefined' ? options.ignoreEndTag : /(%|\?)>/,
ignore: function(text, startTag, endTag) {
buffer.push(options.removeIgnored ? '' : startTag + text + endTag);
},
...or, this approach might allow for better future growth:
ignore: function(text, startTag, endTag) {
buffer.push(options.removeIgnored ? '' : startTag + text + endTag);
},
register: {
ignore: {
startTag: typeof options.ignoreStartTag !== 'undefined' ? options.ignoreStartTag : /<(%|\?)/,
endTag: typeof options.ignoreEndTag !== 'undefined' ? options.ignoreEndTag : /(%|\?)>/
}
}
... or, this feels better to me, but it would break backwards compatability:
ignore: {
startTag: typeof options.ignoreStartTag !== 'undefined' ? options.ignoreStartTag : /<(%|\?)/,
endTag: typeof options.ignoreEndTag !== 'undefined' ? options.ignoreEndTag : /(%|\?)>/,
callback: function(text, startTag, endTag) {
buffer.push(options.removeIgnored ? '' : startTag + text + endTag);
}
},
It's late, so maybe I'm over-thinking this?
Any thoughts/preferences on implementation here?
Thanks!
I am working with handlebars. Open and close tags are {{ ... }}
and {{{ ... }}}
.
Simplest regex pattern I can think of, that matches both cases, is:
ignoreStartTag: /\{{2,3}/,
ignoreEndTag: /\}{2,3}/
From what I can tell, a closing tag of 3 characters might present a problem.
@Luis-TP Thanks for example! I'll build it into my tests. :+1:
@mhulse I'm thinking something like this as the minifier API:
minify(input, { ignoreStartTag: '...', ignoreEndTag: '...', ... });
and then we'd pass those "tags" to parser (for use instead of startIgnore
& endIgnore
). Right?
@kangax that looks good to me. Thanks for feedback.
I'll send a PR today so we can discuss the details further (if need be). :+1:
@mhulse awesome, thanks Micky!
Well, I thought this would be easy, but with anything programming related, I have been proven wrong. :laughing:
I was able to get a passing test for @kangax's suggestion of /<\%+/
, but when I tried matching /\{{2,3}/
, my code would never get a chance to "see" those spots in the HTML string.
Here's where I left off:
https://github.com/mhulse/html-minifier/commit/d470390e7f8892207738bf524670170bfed48154
Because opening tags like {{
{{{
are not flagged as HTML (somewhere along the html-minifier
's code path ... I have yet to figure this out due to complexity of code) the aforementioned characters get treated as normal text (not HTML) and therefore my code (i.e., else if (html.search(handler.ignoredStartTag) === 0) { ...
) never has a chance to parse things that don't start with an angle brace <
(in other words, non-<
elements are treated as chars
and the other parts of the script take over).
I think I need to take a break and come back at this with a fresh outlook (my Sunday is almost over anyway).
In the meantime, does anyone else have any tips on how best to handle this scenario? I think I could use a fresh pair of eyeballs to look at my logic. :+1:
Apologies, I should have clarified that the issues I am having are occurring when using handlebars for dynamic tag attributes. For instance:
<button {{#if disabled}}disabled{{/if}}>...</button>
As you can see even github's syntax highlighting marks it as invalid html.
If the {{...}}
occurs within the attributes quoted value, its not a problem:
<button class="{{#if primary}}primary{{/if}}">...</button>
Furthermore if the {{...}}
occurs within plain text it is usually not a problem either:
<div>{{ example }}</div>
Perhaps I jumped the gun in thinking that custom syntax could resolve this issue? Would this need to be handled a different way?
@Luis-TP Thanks for clarification. I need to take some time to think about your situation, so I don't have anything helpful to say at the moment.
Running your first example through the online html-minifier
, I see this error:
Parse Error:
<button {{#if disabled}}disabled{{/if}}>...</button>
So, your code doesn't even make it through the parser. I see now.
Perhaps I jumped the gun in thinking that custom syntax could resolve this issue? Would this need to be handled a different way?
Hrmm, that is a good question. I have not used handlebars (yet) myself, so a piece of the puzzle is missing for me.
As it stands currently, my code would need to be moved/refactored in order to account for custom syntax like you posted.
But now that I know more about what you're trying to do, I'll take another look at the html-minifier
's code ... We might be able to handle this in another location, separate from "ignored" tag checking.
With that said, my new code does now allow for someone to pass custom regex into the custom tag checking bits. On one hand, it won't handle {{
or }}
template tags, but it will handle tags starting/ending with <
and >
.
@kangax If we're thinking purely in terms of small, iterative, "improvements" to the code, then I could finish what I'm working on and do a PR. From there, we could open a new issue and continue our talk on how to best handle template tags, like what @Luis-TP posted. How does that sound?
Of course, I don't mind going back to drawing board if we want to try and account for any/all "ignored" type of tags (including template tags like handlebars) in one set of "ignoreXXXXXTag
" options.
If we're thinking purely in terms of small, iterative, "improvements" to the code, then I could finish what I'm working on and do a PR. From there, we could open a new issue and continue our talk on how to best handle template tags, like what @Luis-TP posted. How does that sound?
I suppose that sounds good. I'm thinking that perhaps string-based tags could be a nice option for now. For example:
ignoreStartTag: '<%=',
ignoreEndTag: '%>'
or even something like:
ignoreTags: [[ '<%=', '%>' ], ['<%', '%>']]
or:
ignoreTags: ['<%=', '%>', '<%', '%>']
Any progress on this? I could really use this.
@Thomasdezeeuw I've been meaning to re-visit at some point. I've been wrapped up in another project since I last left off here. #98 is related, and I'm going to try to carve out a weekend to do a PR on both.
In the meantime, if you could provide example code (or describe your needs in detail) for your particular situation, that would probably be helpful. The more examples I can use to test against, the better. :+1:
Has there been any progress on this? I was just trying to minify some mustache templates that may contain markup like...
<div class="name">{{name}}</div>
{{#address}}
{{>addressTemplate}}
{{/address}}
{{^address}}
{{>addressFormTemplate}}
{{/address}}
but minify
just seems to just hang with that input.
Last time I left off was here: https://github.com/kangax/html-minifier/issues/83#issuecomment-28165671
I haven't used the htmlminifier much since then, so my head is out of the code. I'd love to give back to this project, but my free time has been limited lately. :frowning:
Anyone currently available to tackle this issue?
Based on work I've done above, and @kangax's comments here https://github.com/kangax/html-minifier/issues/83#issuecomment-28647611, seems like we could be close to having a working patch.
@broox doesn't hang for me. I get output like this:
<div class=name>{{name}}</div>{{#address}} {{>addressTemplate}} {{/address}} {{^address}} {{>addressFormTemplate}} {{/address}}
@kangax I forgot to follow up on this, but later realized that html was silently failing on one of the included templates.
Hi, what is the status of support for custom tags like "{{" and "{%"?
Based on the code introduced in this PR:
New feature: Ignoring tags — Specifically, ignoring strings that start with
<?
or<%
and end with?>
or%>
.... @kangax suggests: