terser / html-minifier-terser

actively maintained fork of html-minifier - minify HTML, CSS and JS code using terser - supports ES6 code
https://terser.org/html-minifier-terser
MIT License

[Bug]: `<meta name="viewport" content="width=device-width initial-scale=1">` is incorrectly minified #178

Open davidmurdoch opened 2 months ago

davidmurdoch commented 2 months ago

What happened?

The parsing algorithm for the CSS viewport content attribute is defined here: https://drafts.csswg.org/css-viewport/#parsing-algorithm

The following characters are valid separators in some places (but not all):

Horizontal tab (0x09)
Line feed (0x0a)
Carriage return (0x0d)
Space (0x20)
Comma (0x2c)
Semicolon (0x3b)

3.2. Parsing algorithm

Below is an algorithm for parsing the content attribute of the `<meta>` tag produced from testing Safari on the iPhone. The testing was done on an iPod touch running iPhone OS 4. The UA string of the browser: "Mozilla/5.0 (iPod; U; CPU iPhone OS 4_0 like Mac OS X; en-us) AppleWebKit/532.9 (KHTML, like Gecko) Version/4.0.5 Mobile/8A293 Safari/6531.22.7". The pseudo code notation used is based on the notation used in [Algorithms].

The whitespace class contains the following characters (ascii):

Horizontal tab (0x09)
Line feed (0x0a)
Carriage return (0x0d)
Space (0x20)

The recognized separator between property/value pairs is comma for the Safari implementation. Some implementations have supported both commas and semicolons. Because of that, existing content use semicolons instead of commas. Authors should be using comma in order to ensure content works as expected in all UAs, but implementors may add support for both to ensure interoperability for existing content.

The separator class contains the following characters (ascii), with comma as the preferred separator and semicolon as optional:

Comma (0x2c)
Semicolon (0x3b)

Parse-Content(S)
  i ← 1
  while i ≤ length[S]
    do while i ≤ length[S] and S[i] in [whitespace, separator, '=']
         do i ← i + 1
       if i ≤ length[S]
         then i ← Parse-Property(S, i)

Parse-Property(S, i)
  start ← i
  while i ≤ length[S] and S[i] not in [whitespace, separator, '=']
    do i ← i + 1
  if i > length[S] or S[i] in [separator]
    then return i
  property-name ← S[start .. (i - 1)]
  while i ≤ length[S] and S[i] not in [separator, '=']
    do i ← i + 1
  if i > length[S] or S[i] in [separator]
    then return i
  while i ≤ length[S] and S[i] in [whitespace, '=']
    do i ← i + 1
  if i > length[S] or S[i] in [separator]
    then return i
  start ← i
  while i ≤ length[S] and S[i] not in [whitespace, separator, '=']
    do i ← i + 1
  property-value ← S[start .. (i - 1)]
  Set-Property(property-name, property-value)
  return i
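To make the effect of the pseudocode easier to check, here is a rough JavaScript port of Parse-Content and Parse-Property. It is only a sketch: the function names `parseViewportContent` and `parseProperty` are made up for this example, indices are 0-based rather than 1-based, and instead of calling Set-Property it collects the raw `[name, value]` pairs in an array.

```javascript
// Character classes as listed in the quoted spec text (ASCII only).
const WHITESPACE = new Set(['\t', '\n', '\r', ' ']);
const SEPARATOR = new Set([',', ';']);

const isWs = (c) => WHITESPACE.has(c);
const isSep = (c) => SEPARATOR.has(c);

// Sketch of Parse-Content. Hypothetical name; returns the raw pairs
// rather than calling Set-Property.
function parseViewportContent(s) {
  const pairs = [];
  let i = 0;
  while (i < s.length) {
    // Skip any run of whitespace, separators and '='.
    while (i < s.length && (isWs(s[i]) || isSep(s[i]) || s[i] === '=')) i++;
    if (i < s.length) i = parseProperty(s, i, pairs);
  }
  return pairs;
}

// Sketch of Parse-Property. Appends to `pairs` and returns the new index.
function parseProperty(s, i, pairs) {
  let start = i;
  // Read the property name.
  while (i < s.length && !(isWs(s[i]) || isSep(s[i]) || s[i] === '=')) i++;
  if (i >= s.length || isSep(s[i])) return i;
  const name = s.slice(start, i);
  // Skip ahead to the '=' (or give up at a separator / end of input).
  while (i < s.length && !(isSep(s[i]) || s[i] === '=')) i++;
  if (i >= s.length || isSep(s[i])) return i;
  // Skip '=' and any whitespace around it.
  while (i < s.length && (isWs(s[i]) || s[i] === '=')) i++;
  if (i >= s.length || isSep(s[i])) return i;
  // Read the property value; whitespace ends it.
  start = i;
  while (i < s.length && !(isWs(s[i]) || isSep(s[i]) || s[i] === '=')) i++;
  pairs.push([name, s.slice(start, i)]);
  return i;
}

console.log(parseViewportContent('width=device-width initial-scale=1.0'));
console.log(parseViewportContent('width=device-widthinitial-scale=1.0'));
```

With the space in place, the sketch yields two pairs (`width` and `initial-scale`). With the space removed, it yields a single `width` pair whose value is `device-widthinitial-scale`, and the trailing `1.0` is dropped because it is read as a name with no value.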

Set-Property matches the listed property names case-insensitively. The property-value strings are interpreted as follows:

  1. If a prefix of property-value can be converted to a number using strtod, the value will be that number. The remainder of the string is ignored.
  2. If the value can not be converted to a number as described above, the whole property-value string will be matched with the following strings case-insensitively: yes, no, device-width, device-height
  3. If the string did not match any of the known strings, the value is unknown.
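The three value rules above can be sketched in JavaScript as follows. The function name `interpretViewportValue` and the shape of the returned object are invented for illustration. `Number.parseFloat` stands in for `strtod` here, which is only an approximation: both accept a numeric prefix and ignore the rest, but `strtod` also accepts forms such as hexadecimal that `parseFloat` does not.

```javascript
// Keywords listed in rule 2 of the quoted spec text.
const KEYWORDS = ['yes', 'no', 'device-width', 'device-height'];

// Hypothetical helper: classify a raw property-value string.
function interpretViewportValue(value) {
  // Rule 1: a numeric prefix wins; the remainder of the string is ignored.
  // Assumption: parseFloat approximates strtod closely enough for a sketch.
  const n = Number.parseFloat(value);
  if (!Number.isNaN(n)) return { kind: 'number', value: n };

  // Rule 2: otherwise match the whole string case-insensitively.
  const lower = value.toLowerCase();
  if (KEYWORDS.includes(lower)) return { kind: 'keyword', value: lower };

  // Rule 3: anything else is unknown.
  return { kind: 'unknown' };
}

console.log(interpretViewportValue('1.0'));
console.log(interpretViewportValue('Device-Width'));
console.log(interpretViewportValue('device-widthinitial-scale'));
```

Under these assumptions, `device-widthinitial-scale` falls through to rule 3, because rule 2 matches the whole string and not a prefix.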

Version

v7.2.0

What browsers are you seeing the problem on?

No response

Link to reproduce

No response

Relevant log output

Create an HTML file named `index.html` with the following contents:

<!DOCTYPE html>
<html>
  <head>
    <meta name="viewport" content="width=device-width initial-scale=1.0" />
  </head>
  <body></body>
</html>

Then run `npx -y html-minifier-terser@latest index.html`.

The value of the `content` attribute, `width=device-width initial-scale=1.0`, is minified to `width=device-widthinitial-scale=1.0`, which is now invalid: the whitespace that separated the `width` pair from the `initial-scale` pair has been removed. The value should remain unchanged in this case.
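For comparison, here is a minimal sketch of a whitespace treatment that would still be safe under the parsing algorithm quoted above: collapse each run of whitespace to a single space instead of deleting it. This is not html-minifier-terser's code or a proposed patch; `normalizeViewportContent` is a hypothetical name, and the character class is limited to the four whitespace characters listed at the top of this report.

```javascript
// Hypothetical helper: shrink whitespace in a viewport content value
// without merging adjacent property/value pairs.
function normalizeViewportContent(content) {
  return content
    // Leading and trailing whitespace is skipped by the parser, so drop it.
    .replace(/^[\t\n\r ]+|[\t\n\r ]+$/g, '')
    // Keep one space per run so whitespace-separated pairs stay separated.
    .replace(/[\t\n\r ]+/g, ' ');
}

console.log(normalizeViewportContent('  width=device-width \n  initial-scale=1.0 '));
```

The input from this report, `width=device-width initial-scale=1.0`, passes through unchanged, which matches the expected behavior described above.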



Willing to submit a PR?

None