rgrove / crass

A Ruby CSS parser that's fully compliant with the CSS Syntax Level 3 specification.
MIT License
139 stars 14 forks source link

at_rule block contents are simple_blocks #5

Closed Fustrate closed 6 years ago

Fustrate commented 6 years ago

https://gist.github.com/Fustrate/c0134c4074d11c7146e7c3ee65cba8ce

@supports (display: flex) {
  @media screen and (min-width: 900px) {
    article {
      padding: 1rem 3rem;
    }
  }
}

With nested at_rules, the contents should essentially be expressed as a fresh root would be. Notice that the two at_rules are expressed differently in the tree, the first being as expected but the second being a simple_block. The actual article rule is also a simple_block instead of a style_rule.

I'm also pretty sure that what I'm working on is better expressed as a formatter than a transformer, since it turns the tree into a string instead of modifying the tree itself. Everything else is going well aside from this issue, which makes formatting the block contents a lot more complicated than it should be. There are a few tokens I'll have to do more research on (namely cdo/cdc) but I don't anticipate any other issues popping up.

rgrove commented 6 years ago

With nested at_rules, the contents should essentially be expressed as a fresh root would be.

Is this an opinion, or can you cite a section of the parsing spec that backs this up?

In theory I think I agree with you, but I'm not sure the spec currently agrees. Crass adheres to the spec-defined parsing algorithm, so unless I'm misunderstanding the spec and/or Crass isn't following the spec properly, I'm not sure this can change without departing from the spec.

Fustrate commented 6 years ago

It's definitely an opinion, and as you can tell, I'm not great with phrasing; I meant it more as a question than a statement. I can tell from the source comments you did your research on the spec and followed it very closely. I'm already pretty much resigned to this being a case of a strange spec and that there's nothing I or anyone else can do about it. I'm guessing the spec doesn't have an official reference parser/tokenizer, or an associated document with example input and the expected tokenized output. As far as I've been able to find, Crass is in line with the standard way of representing that input.

The current version of my formatter is on my work computer, but the only question left for the majority of everyday stylesheets is that of nested blocks (mostly media rules). I wanted to make sure I asked about this since it definitely makes a difference for how complicated the formatter will be. I'll essentially have to send the array of tokens through a second tokenizer, reading each one and deciding if it's a rule/property/value/etc. If that's what I have to do, then that's what I'll do.

I've waffled back to thinking a formatter is a purely Sanitize-side problem, since with a customizable formatter key under the css config section, it can substitute the user-provided formatter for Crass::Parser.stringify. Crass wouldn't need to change at all unless you want to separate stringify into an actual default formatter object - it doesn't quite seem like it belongs as a random method on Crass::Parser.

Side question, I have a bad habit of getting sidetracked into cleaning up code. Is that something that's worth submitting a PR for? I know some people don't like tools such as rubocop, but I'm willing to set that up and submit PRs for both projects. In line with that, Ruby 1.9 was EOL'd almost 3 years ago, so maybe the utf-8 magic comments are no longer necessary, though I don't quite see a reason for them to have been in the first place?

rgrove commented 6 years ago

It's definitely an opinion, and as you can tell, I'm not great with phrasing; I meant it more as a question than a statement.

Got it. In that case, my answer is that I think Crass is currently doing what the spec says it should do. 🙂

I'm guessing the spec doesn't have an official reference parser/tokenizer, or an associated document with example input and the expected tokenized output.

The closest thing to a reference parser is probably parse-css, written by one of the authors of the spec. Crass's test suite also includes these comprehensive parsing tests by another author of the spec.

I've waffled back to thinking a formatter is a purely Sanitize-side problem, since with a customizable formatter key under the css config section, it can substitute the user-provided formatter for Crass::Parser.stringify.

Given that the original goal was to collapse unnecessary whitespace, are you sure a custom stringifier is necessary? I was imagining that you'd just remove unnecessary whitespace nodes from the parse tree, then pass the modified parse tree to Crass::Parser.stringify.

Side question, I have a bad habit of getting sidetracked into cleaning up code. Is that something that's worth submitting a PR for?

Thanks for asking! Please don't submit code-cleanup PRs unless they fix actual bugs. At this point, both Crass and Sanitize are mature projects that are essentially in maintenance mode (meaning I'm interested in fixing bugs, but I don't plan to add significant new features or do any unnecessary refactoring).

Fustrate commented 6 years ago

The closest thing to a reference parser is probably parse-css, written by one of the authors of the spec.

Aha, that's exactly where I found the little output testing site I linked to, so that's good to know.

Given that the original goal was to collapse unnecessary whitespace, are you sure a custom stringifier is necessary?

Although similar, a transformer and a formatter really are two separate things, and whitespace management tends to fall into the latter category. You're never going to apply two different "whitespace transformers" to a stylesheet, whereas you could have transformers for collapsing rules, removing duplicate properties, adding/removing prefixes, alphabetizing properties, etc. A formatter would be responsible for newlines, spaces<->tabs (and how many), a bit of capitalization, single or double colons, and things of that nature that are strictly presentation.

Please don't submit code-cleanup PRs unless they fix actual bugs.

Exactly why I prefer to ask. I find code styling somewhat therapeutic, but sometimes it's best to leave working code as it is.