sass / dart-sass

The reference implementation of Sass, written in Dart.
https://sass-lang.com/dart-sass
MIT License
3.95k stars 358 forks

HTML entities are replaced with the actual symbols in the compiled css #1219

Closed mtskf closed 3 years ago

mtskf commented 3 years ago

🐛 bug report

HTML entities are replaced with the actual symbols in the compiled CSS, and as a result the symbols show up garbled.

For example, if I compile SCSS containing a pseudo-element with an HTML entity like this:

Source SCSS:

@charset "UTF-8";
#test:before { content: "\2713"; }

Compiled css:

@charset "UTF-8";
#test:before {
  content: "✓";
}

The HTML entity '\2713' is replaced with the actual symbol '✓' in the compiled CSS, and then it's shown like this "e✓" in browsers.

nex3 commented 3 years ago

The UTF-8 character is semantically identical to the escape as long as the CSS file itself is being parsed as UTF-8, and as long as the @charset declaration is there all browsers should parse it as such. What browser version are you using? Can you provide a website that reproduces the mangled rendering?

lhtin commented 3 years ago

Hi, I ran into the same problem. Here is the information:

System Info

Reproduce

Create two files:

- `test.js` file:
```js
const sass = require('sass');
const result2 = sass.renderSync({file: './test.scss'});
console.log('sass:\n', result2.css.toString());
```

- `test.scss` file:
```scss
.a-icon {
  content: "\E91E";
}
```

Then run `node test.js`; the output looks like this:

image

The problem is that the output of node-sass and sass is different, and I think node-sass's output is right. Thank you for your time.

lhtin commented 3 years ago

When I viewed the output with the Hex Fiend tool, I found the difference:

src: 5C45393145
after node-sass: 5C45393145
after sass: EEA49E

Sass converts the escape string (5C45393145) to the real value (EEA49E), but node-sass doesn't do that. I think the escape string is more suitable in CSS, and the real value sometimes causes Chrome to render the icon font garbled:

image
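
For anyone following along, the two byte sequences above can be reproduced with a few lines of Node. This is only an illustrative sketch (the variable and helper names are mine, not from either compiler): it encodes the five-character escape text and the literal U+E91E character as UTF-8 and prints both as hex.

```javascript
// The escape as written in test.scss: the five ASCII characters \ E 9 1 E.
const escapeText = '\\E91E';
// The character that the escape refers to (a private-use code point).
const literalChar = String.fromCodePoint(0xe91e);

// Hypothetical helper: UTF-8 encode a string and format the bytes as hex.
const toHex = (text) => Buffer.from(text, 'utf8').toString('hex').toUpperCase();

const escapeHex = toHex(escapeText);
const literalHex = toHex(literalChar);
console.log('escape :', escapeHex);
console.log('literal:', literalHex);
```

The first value matches the "src" and "after node-sass" lines, the second matches the "after sass" line.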

nex3 commented 3 years ago

As above, please provide an example of a browser version and web page where this is rendering incorrectly, because according to the CSS spec the two examples you provided have identical semantics.

lhtin commented 3 years ago

Here is a little demo; it looks like the following screenshot. I found that if I omit the @charset "UTF-8" in the CSS file, the real-value version doesn't work, but the escaped version works fine. So you are right, but maybe the escaped version has better compatibility when browsers have to detect the CSS file's charset. The @charset "UTF-8" was not added to my CSS file when I used sass-loader to compile the SCSS file, so the problem occurred.

Screenshot: image
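
One way to see why the literal character breaks without the declaration: if the UTF-8 bytes are decoded with a single-byte encoding, every byte becomes its own character. The sketch below assumes a Latin-1 fallback purely for illustration; which encoding a browser actually falls back to depends on the page and its environment.

```javascript
// The same rule written with a literal character and with an escape.
const literalBytes = Buffer.from('content: "\u2713";', 'utf8');
const escapedBytes = Buffer.from('content: "\\2713";', 'utf8');

// Decoding the literal version correctly vs. with the assumed Latin-1 fallback.
const literalAsUtf8 = literalBytes.toString('utf8');
const literalAsLatin1 = literalBytes.toString('latin1');

// The escaped version is pure ASCII, so both decodings agree.
const escapedIsStable =
  escapedBytes.toString('utf8') === escapedBytes.toString('latin1');

console.log(JSON.stringify(literalAsUtf8));
console.log(JSON.stringify(literalAsLatin1));
console.log('escaped version unaffected:', escapedIsStable);
```

With the wrong decoding the single check mark turns into three unrelated characters, while the ASCII-only escaped rule reads the same either way.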

nex3 commented 3 years ago

If you're deleting the @charset declaration, then you're changing the semantics of the CSS from what Sass generates and you shouldn't be surprised that it can render incorrectly. If sass-loader is removing it, that sounds like a bug; I suggest you file it with them.

To be clear, we don't choose to emit real Unicode characters capriciously. We do so because it's considerably more compact than generating escapes and (more importantly) because otherwise anyone writing class names, comments, or content strings in non-ASCII-friendly languages would find their compiled stylesheets hopelessly illegible if they were just a bunch of escape codes. If you would prefer escapes, you can always postprocess the CSS with a tool like this one.
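
A post-processing step of that kind can be quite small. The following is a minimal sketch and not the implementation of any particular tool; `escapeNonAscii` is a hypothetical helper that rewrites every non-ASCII character in a CSS string as a hex escape, adding the terminating space only when the next character would otherwise be read as part of the escape.

```javascript
// Hypothetical helper: replace each non-ASCII code point with a CSS hex escape.
function escapeNonAscii(css) {
  return css.replace(/[^\x00-\x7F]/gu, (ch, offset, whole) => {
    const hex = ch.codePointAt(0).toString(16).toUpperCase();
    // A following hex digit or whitespace would be absorbed into the escape,
    // so a single space is added to terminate it in those cases.
    const next = whole[offset + ch.length] ?? '';
    const needsSpace = /^[0-9a-fA-F \t\n\f\r]$/.test(next);
    return '\\' + hex + (needsSpace ? ' ' : '');
  });
}

const compiled = '.a-icon { content: "\uE91E"; }';
const escapedCss = escapeNonAscii(compiled);
console.log(escapedCss);
```

Note that this simple version also rewrites non-ASCII text in comments and selectors, which may or may not be what you want.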

lhtin commented 3 years ago

Thank you so much for your reply. Would it be better to keep the original content than to always emit Unicode characters? That is, regardless of whether people use Unicode characters or escapes, always preserve them in the compiled file. I think unchanged content is more friendly. It would not only keep Unicode characters for non-ASCII-friendly languages but also keep the escapes used for icon fonts.

nex3 commented 3 years ago

Whether a character was written as an escape sequence or as a literal character is a detail of its parsing. There's no efficient way to preserve that information through to the point where that character gets serialized to CSS again.

lhtin commented 3 years ago

Really? Maybe you can introduce a flag about escaped or not of tokens when parsing, then you can use it to determine how to serialize. Another way maybe just to treat escape characters as ASCII characters and let the browser do escape parse. I found node-sass's behavior like that:

scss file:

.a-icon {
  content: "\E91E";
}
.a-iconb {
  content: "你弽"
}

after compiling with node-sass:

@charset "UTF-8";
.a-icon {
  content: "\E91E"; }

.a-iconb {
  content: "你弽"; }

after compiling with sass:

@charset "UTF-8";
.a-icon {
  content: "";
}

.a-iconb {
  content: "你弽";
}

nykoleks commented 3 years ago

Hi there. I have the same problem: content:"\e935"; in .scss is converted to content:"" in .css. This is very bad for working with icon fonts. It breaks bootstrap-glyphicon, fontawesome, any custom-created icon font... terribly... How do I need to write it so that if I write "\e935" in .scss, I get "\e935" in .css?

\\ - this doesn't work - output: \\. I tried --no-unicode and --no-charset, and wrote @charset "ASCII"; (and @charset "UTF-8") at the top of the base file, with and without the above, together and separately;

What do I need to do to force the code to compile correctly???

MacBook OS: Big Sur. sass: Dart Sass 1.32.7, installed with brew (brew install sass/sass/sass)

Before the "global" update I used Ruby Sass and everything worked fine.

The following hack works:


@function symbol-fix($symbol){
  $ret: '\'\\#{$symbol}\'';
  @return $ret;
}
.i-test:before {
  content: #{symbol-fix(e900)};
}

But it is a very, very, very bad solution. It worked for one little project that needed a very quick solution, but it is not applicable to large projects. So if the Sass developers do not create a fast and good solution for this issue, then (I don't like this solution but have no choice) I will have to convert all my Sass to .less. Please do not force me to do this... I don't like .less... cr**...

nex3 commented 3 years ago

I'm closing this as a duplicate of #568, since it looks like there isn't a case where a browser is actually rendering the UTF-8 character incorrectly. I'll still address some of the outstanding questions here, though.

@lhtin

Maybe you can introduce a flag about escaped or not of tokens when parsing, then you can use it to determine how to serialize.

What do we do about a stylesheet that uses some escapes and some literal Unicode characters? What if we have a situation where a user is (incorrectly) relying on their stylesheet to be emitted as plain-ASCII, but then a dependency uses the word "naïve" in a comment and breaks them?

Generally speaking, using heuristics like this just makes the behavior feel even more inconsistent and capricious than having a configurable option.

Another way maybe just to treat escape characters as ASCII characters and let the browser do escape parse.

This would violate Sass's fundamental design principle of being a CSS superset. According to CSS, the token "\2603" and the token "☃" are identical in meaning, but if we just didn't touch escape codes then "\2603" == "☃" would return false. It would similarly break all of Sass's string functions.
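
To illustrate the point about parsing, here is a simplified sketch. It is not Dart Sass's actual parser: `decodeCssEscapes` is a hypothetical helper that ignores line continuations and other string details. Once escapes are decoded, both spellings produce the same one-character string, and nothing in that string records which spelling was used.

```javascript
// Hypothetical helper: decode CSS escapes inside the contents of a string token.
function decodeCssEscapes(raw) {
  // Either a hex escape (1-6 digits, optional terminating whitespace)
  // or a backslash followed by any single non-newline character.
  const escape = /\\([0-9a-fA-F]{1,6})[ \t\n\f\r]?|\\(.)/gu;
  return raw.replace(escape, (match, hex, literal) => {
    if (hex === undefined) return literal;
    const codePoint = parseInt(hex, 16);
    // CSS maps zero, surrogates, and out-of-range values to U+FFFD.
    const invalid = codePoint === 0 || codePoint > 0x10ffff ||
      (codePoint >= 0xd800 && codePoint <= 0xdfff);
    return invalid ? '\uFFFD' : String.fromCodePoint(codePoint);
  });
}

const fromEscape = decodeCssEscapes('\\2603'); // source text: \2603
const fromLiteral = decodeCssEscapes('\u2603'); // source text: the snowman itself
console.log(fromEscape === fromLiteral, fromEscape.length);
```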

@nykoleks

As I explained above, the Unicode character has the exact same semantics in CSS as the escape code. Even though it looks different if you inspect the generated CSS in a text editor, it won't cause the browser to render it any different. (Unless you mess with the @charset declaration at the top, in which case—don't do that!)

If you want a workaround, I'll again recommend using a postprocessor like postcss-sass-unicode which will convert UTF-8 characters into escape codes.

lhtin commented 3 years ago

This would violate Sass's fundamental design principle of being a CSS superset. According to CSS, the token "\2603" and the token "☃" are identical in meaning, but if we just didn't touch escape codes then "\2603" == "☃" would return false. It would similarly break all of Sass's string functions.

  1. Since the two tokens are identical, It should not conflict with the design principles that try to keep tokens which valid in CSS unchanged after compiling from Sass source, right? I think the unchanged encode of characters is very friendly and important for Sass's users.
  2. For the break of Sass's string functions, can it be solved by modifying the implements of those functions?

lhtin commented 3 years ago

@nykoleks Can you check whether the output CSS file has the @charset "UTF-8"; string at the beginning? If it doesn't, add it and try again.

nex3 commented 3 years ago

  • Since the two tokens are identical, It should not conflict with the design principles that try to keep tokens which valid in CSS unchanged after compiling from Sass source, right? I think the unchanged encode of characters is very friendly and important for Sass's users.
  • For the break of Sass's string functions, can it be solved by modifying the implements of those functions?

Neither of these are technically feasible. Tracking the original state of each character in a string would require a considerable amount of memory and processing overhead for every string Sass manages, and trying to decode escapes on-the-fly in every string function would similarly add a massive overhead to those functions (including extremely unintuitive performance characteristics like str.length() being O(n)). Neither of these are a better solution than simply globally choosing the encoding of the output.
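
A rough sketch of the second cost, assuming a hypothetical representation in which strings keep their escapes undecoded (this is not how Dart Sass stores strings, and `lengthWithEscapes` is an illustrative name): counting characters then means scanning the whole string instead of reading a stored length.

```javascript
// Hypothetical: count the characters of a string whose escapes are still raw.
function lengthWithEscapes(raw) {
  // One token is a hex escape, a single-character escape, or one code point.
  const token = /\\[0-9a-fA-F]{1,6}[ \t\n\f\r]?|\\.|[\s\S]/gu;
  let count = 0;
  for (const _match of raw.matchAll(token)) count += 1; // O(n) walk
  return count;
}

const rawText = 'a\\2603 b'; // source text: a\2603 b
const storedLength = rawText.length;
const logicalLength = lengthWithEscapes(rawText);
console.log(storedLength, logicalLength);
```

The stored length counts every character of the escape, so it cannot be used directly; the logical length needs the scan.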

lhtin commented 3 years ago

According to your answer, I'm curious why node-sass can do that? Just like the above example:

scss file:

.a-icon {
  content: "\E91E";
}
.a-iconb {
  content: "你弽"
}

after compiling with node-sass:

@charset "UTF-8";
.a-icon {
  content: "\E91E"; }

.a-iconb {
  content: "你弽"; }

after compiling with sass:

@charset "UTF-8";
.a-icon {
  content: "";
}

.a-iconb {
  content: "你弽";
}

nex3 commented 3 years ago

LibSass's string parsing is outdated and incorrect. It doesn't follow CSS semantics and won't behave correctly with string functions or equality. This is part of the reason that LibSass is deprecated.

RYJASM commented 3 years ago

I am currently experiencing this issue as well. Even though the unicode character is there, many of the fonts used in my editor don't have that symbol, so I cannot tell what it is any longer. With the code, I could.

This is an issue because I'd like to keep a tabulated list of what values are assigned to what class. Without being able to see that in the css, it becomes difficult to diagnose issues or edit code and refactor it within systems after the sass has been compiled.

I also have issues actually using the code, because of the limited character set in several of the systems' databases that I use and only have front end or mid end access to. They simply will not accept the outputted code any longer due to the odd characters now emitted by the compiler.

To me it's a major drawback for sass/scss to convert the characters to something different than what I intended and seems akin to changing color values like #fff to named colors or not respecting my chosen gender. I'd certainly hate to be given a different gender at birth and not get to go with what I choose.

It's the same with writing the representation of a value vs converting it.

This is too hands on for a compiler. The default action should be to leave it as is and only convert it when a special marker is in place.

So to any of you magical coders out there doing the right thing and fixing issues, the calls of humanity are upon you and we are waiting on our knees for your kindness.

cbush06 commented 3 years ago

I am having the same issue when trying to use Font Awesome unicodes in CSS content properties for :before pseudo elements with an Angular CLI project. I'm reverting to node-sass and hope that Dart Sass will fix this issue in the future.

Awjin commented 3 years ago

@RYJASM I understand you might be frustrated, but let's keep this discussion centered on code. It's inappropriate to equate Sass's string parsing (which correctly follows CSS semantics and avoids bad performance) to the ongoing trauma and prejudice of gender issues.

RYJASM commented 3 years ago

Code that doesn’t do what we need isn’t acting correctly. Performance of a toilet simply doesn’t matter when it requires the excrement to be first puréed and pressed into cubes before flushing. It shouldn’t be altering the values we put in for the content property regardless. You and I both know rules are simply written ideas that should fit what we want. And we don’t want our content values being converted to other characters.

If the rules are the way they are. Change them and then fix this issue. Because it is an issue regardless of what the rules currently say.

RYJASM commented 3 years ago

The issue at this point seems to be less about what is needed and expected and more about breaking through the ego of those protecting the perfectly written rules and performance, rather than focusing on the correct and expected behavior of the sass compiler.

I’ve seen this issue closed down over and over again, when the issue still exists and needs to be addressed.

cbush06 commented 3 years ago

Sweet Jesus. Can we not make this a political issue? There are tickets open for this. There are multiple workarounds in the meantime. Take societal issues (gender or otherwise) elsewhere.

RYJASM commented 3 years ago

With all due respect, I think it’s a perfect analogy. Let’s not bring religion into this conversation.

cbush06 commented 3 years ago

It's an expression or phrase. Like "OMG!"

RYJASM commented 3 years ago

Oh I'm sorry I was using my built in compiler for expressions and it converted what you wrote to mean that you were calling on support of Jesus Christ lord and savior instead of the intended meaning that you wrote. My bad.

This is just another example of how it is important to not try to translate what is written when it's not necessary or expected.

nex3 commented 3 years ago

This issue is getting a bit heated, so I'm going to lock it. Here's the final summary: