cloudflare / lol-html

Low output latency streaming HTML parser/rewriter with CSS selector-based API
https://crates.io/crates/lol-html
BSD 3-Clause "New" or "Revised" License

Rename all tokens to rewritable units, rename lexeme to token #16

Closed inikulin closed 10 months ago

inikulin commented 4 years ago

The current terminology might be a bit confusing, so let's rename tokens to rewritable units (StartTag will still be a rewritable unit, just not exposed in the public API).

This allows us to rename lexeme to token.

nox commented 3 years ago

The current code uses all three terms: "rewritable units", "token" and "lexeme". Did we stop midway through such a rename?

inikulin commented 3 years ago

@nox no, these are all different things. The intention of this ticket is to simplify the API and get rid of the term "lexeme". Currently the full parser produces lexemes; if they are captured by a selector, they are converted to "tokens", which expose the rewriting API. In most cases tokens are the same thing as rewritable units, with the exception of Element, which is a combination of a start tag token, an end tag token and inner content.

The idea is to rename "lexeme"s to "token"s to make things more conventional, and to call what is currently named "tokens" "rewritable units", with "Element" just being a compound rewritable unit.
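
To make the proposed naming concrete, here is a minimal sketch of how the terms could relate after the rename. All types and functions below (`Token`, `RewritableUnit`, `to_rewritable_unit`, `to_element`) are hypothetical illustrations, not lol-html's actual definitions, and the rule that start and end tag names must match is an assumption made only for this example.

```rust
// Hypothetical types illustrating the proposed naming.
// These are NOT lol-html's real definitions.

/// Raw parser output: called "lexeme" today, "token" after the rename.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    StartTag { name: String },
    EndTag { name: String },
    Text(String),
}

/// What a selector match exposes for rewriting:
/// called "token" today, "rewritable unit" after the rename.
#[derive(Debug, PartialEq)]
enum RewritableUnit {
    StartTag { name: String },
    EndTag { name: String },
    Text(String),
    /// Compound unit: start tag token, inner content, end tag token.
    Element {
        start_tag: Token,
        inner: Vec<Token>,
        end_tag: Token,
    },
}

/// Converts a single captured token into the matching simple rewritable unit.
fn to_rewritable_unit(token: Token) -> RewritableUnit {
    match token {
        Token::StartTag { name } => RewritableUnit::StartTag { name },
        Token::EndTag { name } => RewritableUnit::EndTag { name },
        Token::Text(text) => RewritableUnit::Text(text),
    }
}

/// Groups a start tag, inner content and end tag into a compound `Element`.
/// Returns `None` unless the outer tokens are a start/end tag pair with the
/// same name (an assumption made for this sketch).
fn to_element(start_tag: Token, inner: Vec<Token>, end_tag: Token) -> Option<RewritableUnit> {
    let is_matching_pair = matches!(
        (&start_tag, &end_tag),
        (Token::StartTag { name: open }, Token::EndTag { name: close }) if open == close
    );
    if is_matching_pair {
        Some(RewritableUnit::Element { start_tag, inner, end_tag })
    } else {
        None
    }
}
```

In this sketch most rewritable units map one-to-one onto a token, while `Element` is the only variant that wraps several tokens, which mirrors the "compound rewritable unit" wording above.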