rehypejs / rehype-minify

plugins to minify HTML
https://unifiedjs.com
MIT License

[Idea] Is it possible to remove some empty tags? #45

Closed shtse8 closed 2 years ago

shtse8 commented 2 years ago

Problem

<p></p>
<h1><br/></h1>

Can we remove those empty content tags to further minify?

Solution

As Above

Alternatives

As Above

wooorm commented 2 years ago

This can’t be done with a minifier, whose goal is to minify without changing meaning (or how things are presented). The empty paragraph and the empty heading each have semantics. They also take up space, and get other styles, when a browser renders them.

shtse8 commented 2 years ago

> This can’t be done with a minifier, whose goal is to minify without changing meaning (or how things are presented). The empty paragraph and the empty heading each have semantics. They also take up space, and get other styles, when a browser renders them.

You are right. I didn't realize that if someone styles these elements, removing them changes the meaning. Or can we build it as an optional preset? People can use it at their own risk. This option is very common in other minifiers:

In most cases people won't apply styles to them. Also, from the same point of view, people can apply styles to an empty attribute, which will cause [rehype-remove-empty-attribute](https://github.com/rehypejs/rehype-minify/blob/main/packages/rehype-remove-empty-attribute) to change the rendered output. So that minification may also change something.

wooorm commented 2 years ago

> This option is very common in other minifiers:

Yes, it is indeed common. However, I don’t understand why. Many elements are supposed to be empty. Many empty elements are intentionally empty. This rule seems to never work the way you’d want it to.

> Or can we build it as an optional preset?

Sure, it could exist. If you want it, why don’t you make it yourself? Not everything has to be here. It could go here, but I just don’t really see how the plugin could work to make it useful instead of breaking everything all the time.
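
A minimal sketch of what such an opt-in transform might look like, assuming a hast-like tree of plain objects (`type`, `tagName`, `children`). The names `removeEmptyElements`, `isBlank`, and `keepWhenEmpty` are hypothetical and not part of rehype-minify, and the exemption list is illustrative rather than complete. It mostly shows how many elements have to be exempted before the rule stops breaking things.

```javascript
// Hypothetical helper, not part of rehype-minify.
// Elements that are void, or that are often intentionally empty.
// Illustrative list only; a real plugin would need a more careful one.
const keepWhenEmpty = new Set([
  'area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input', 'link',
  'meta', 'source', 'track', 'wbr',
  'script', 'style', 'textarea', 'iframe', 'canvas', 'video', 'audio',
  'td', 'th'
])

// A child counts as blank if it is whitespace-only text or a `<br>`.
function isBlank(node) {
  if (node.type === 'text') return node.value.trim() === ''
  return node.type === 'element' && node.tagName === 'br'
}

// Walks the tree bottom-up and drops elements whose children are all blank.
// Mutates and returns the given node.
function removeEmptyElements(node) {
  if (!Array.isArray(node.children)) return node
  node.children = node.children
    .map((child) => removeEmptyElements(child))
    .filter((child) => {
      if (child.type !== 'element') return true
      if (keepWhenEmpty.has(child.tagName)) return true
      return !(child.children || []).every(isBlank)
    })
  return node
}
```

Because the walk is bottom-up, a wrapper that only contained empty elements is removed as well, which is exactly the kind of side effect that makes this risky outside of cleanup jobs.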

> from the same point of view, people can apply styles to an empty attribute, which will cause

You are right, CSS could change anything, but there is a difference in how likely it is.

a) That package uses a list of attributes, described by the HTML standard, that are treated the same when empty as when they are not there.

b) It is less likely that a user has styles for whether attributes exist ([class]), or are exactly an empty string ([class=""]), especially for attributes that are supposed to have a value. It is more likely that people select an element (e.g., h2), and hence that the empty h2 is styled.
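
The list-based approach in a) can be sketched roughly as follows. This is an illustration under assumptions, not the package's code: the function name `removeEmptyAttributes` and the three names in `droppableWhenEmpty` are chosen for the example and are not copied from rehype-remove-empty-attribute. Properties are assumed to be hast-like, so `className` is an array.

```javascript
// Illustrative list only; the real package maintains its own list.
const droppableWhenEmpty = new Set(['id', 'className', 'style'])

// Returns a copy of the element without listed properties that are empty.
// Properties not on the list are kept even when empty.
function removeEmptyAttributes(element, names = droppableWhenEmpty) {
  const properties = {}
  for (const [name, value] of Object.entries(element.properties || {})) {
    const empty = value === '' || (Array.isArray(value) && value.length === 0)
    if (empty && names.has(name)) continue
    properties[name] = value
  }
  return {...element, properties}
}
```

An empty `alt` on an image, for example, is not on the list and survives, because an empty `alt` carries meaning of its own.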

shtse8 commented 2 years ago

> > This option is very common in other minifiers:

> Yes, it is indeed common. However, I don’t understand why. Many elements are supposed to be empty. Many empty elements are intentionally empty. This rule seems to never work the way you’d want it to.

I think the use case is different. It depends. For a production site it is certainly not recommended, because the semantics change. But for code cleaning it may be useful. WYSIWYG editors always generate garbage empty tags like:

I don't see any semantic meaning in those. Also, for code cleaning, semantics are not that important.

> > Or can we build it as an optional preset?

> Sure, it could exist. If you want it, why don’t you make it yourself? Not everything has to be here. It could go here, but I just don’t really see how the plugin could work to make it useful instead of breaking everything all the time.

Please don't misunderstand me. I have already done this for my use case, using a regex to replace empty tags. The reason I am discussing it here is to offer an idea, in the hope that the project can go further and cover more use cases. If discussion is not welcome here, I am happy to close the topic.
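
For readers wondering what a regex-based cleanup of this kind might look like, here is a minimal sketch. It is not the regex used above, which was not posted; the function name `stripEmptyBlocks`, the tag list, and the pattern are all assumptions. Regexes over HTML are brittle (for instance, this one fails on attribute values containing `>`), so it only suits controlled input such as WYSIWYG output.

```javascript
// Hypothetical pattern: a <p>, <div>, or <h1>-<h6> element whose content is
// only whitespace and/or <br> tags. The tag list is an assumption.
const emptyBlock = /<(p|div|h[1-6])\b[^>]*>(?:\s|<br\s*\/?>)*<\/\1>/gi

// Repeats the replacement until nothing changes, so wrappers that become
// empty after an inner element is removed are stripped too.
function stripEmptyBlocks(html) {
  let previous
  do {
    previous = html
    html = html.replace(emptyBlock, '')
  } while (html !== previous)
  return html
}

// stripEmptyBlocks('<p></p><h1><br/></h1><p>Hi</p>') returns '<p>Hi</p>'
```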

> > from the same point of view, people can apply styles to an empty attribute, which will cause

> You are right, CSS could change anything, but there is a difference in how likely it is.
>
> a) That package uses a list of attributes, described by the HTML standard, that are treated the same when empty as when they are not there.
>
> b) It is less likely that a user has styles for whether attributes exist ([class]), or are exactly an empty string ([class=""]), especially for attributes that are supposed to have a value. It is more likely that people select an element (e.g., h2), and hence that the empty h2 is styled.

Please check UnoCSS: it is very common to use attributes for styling.

github-actions[bot] commented 2 years ago

Hi! This was closed. Team: If this was fixed, please add phase/solved. Otherwise, please add one of the no/* labels.

wooorm commented 2 years ago

> The reason I am discussing it here is to offer an idea, in the hope that the project can go further and cover more use cases. If discussion is not welcome here, I am happy to close the topic.

There is a difference between “this idea may not have to be maintained here by us” and “discussion is not welcome”.

Cleaning generated HTML is an interesting goal. If possible, I’d recommend going through rehype-remark and then remark-rehype, that is, converting the HTML to markdown and back again. That should clean a lot of things if the documents in question are content heavy.