Going deeper on custom-property usage

LeaVerou commented 4 years ago

I see last year we only calculated how many websites use custom properties (5%).

This year, I would like to go a lot deeper. E.g.:

[x] What are the most popular property names? (via custom metric or AST)
[x] What percentage of values are set only, used in var() only, or both? (via custom metric)
[x] Which values are most popular? (per WG resolution here) (via custom metric or AST)
[x] What do they use them for? Colors and fonts? Numbers? Strings? Custom datatypes? (via custom metric)
[x] Which properties do they use them in? (via custom metric or AST)
[x] Which functions do they use them in? (via custom metric or AST)
[x] How do they use them?
- [x] Do they take advantage of the cascade, or just on the :root like static variables? (via custom metric)
- [x] Do they take advantage of the reactivity (setting them via JS, or overriding in pseudo-classes)? (via custom metric or AST)
[x] Do they use the fallback parameter? (via custom metric or AST)
[x] Do they use them in conjunction with @supports? If so, what property name do they test for? (via AST)
[x] What is the depth of the dependency graph? Are there any cycles? (via custom metric)
[x] How frequently are variables explicitly reset to initial? (via custom metric or AST)

LeaVerou commented 4 years ago

A few thoughts on how we could calculate these metrics:

Most popular property names can easily be determined from the Rework AST, by counting distinct property names that start with --. Not sure if we should take times used in the stylesheet into account and weigh them higher if they are used more times. Also, if we adapt Greg's script to count custom properties, this may be even easier, since the count will already be there.
Same as above for values, though we may want to do some kind of normalization first
Where are custom properties most used? Count property names whose values include var() calls, excluding custom properties themselves. Similarly, count functions that contain a var() call.
What are custom properties used for? The metric above will answer this to some extent. Another approach would be to try to parse custom property values, or regexp-match for common types (e.g. for colors we could match color function names, hex values, named colors, etc.), though that would be inaccurate in cases where a variable is used for part of a type, e.g. the hue in hsl() colors. But it would still be useful to see what % are numbers, strings, whole colors, etc.
Are they set in JS? This is trickier, but both parsed_css and summary_response_bodies contain a url field, so we can join on that to find JS and CSS that are applied to the same page. Or maybe we don't need to, since often custom properties are set in JS and used in JS. Maybe we should just search JS for something like \.style\.setProperty\(['"]--|\.css\(\{?['"]--. Of course this misses all the various helpers that may be used to set properties, but perhaps we can still get enough data. It would be good to extract the custom property names, measure the most popular property names that are set via JS, and see how they differ from those we found in CSS.
Do they use the fallback parameter? Search values for var\(.+?\) to find all uses of var() (Greg's script could be useful here too), how many matches contain a comma? What are the most popular fallbacks?
@supports is easy, something like (see pen):

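for (let rule of ast.stylesheet.rules) {
    if (rule.type === "supports" && rule.supports.indexOf("(--") > -1) {
        // Custom property names may contain hyphens, so match [\w-]+ rather than \w+
        let match = rule.supports.match(/\(--([\w-]+)(?::\s*(.+?))?\)/);
        // Guard against conditions that do not match, instead of destructuring null
        if (match) {
            let [, property, value] = match;
            console.log({property, value});
        }
    }
}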

How frequently are variables explicitly reset to initial? This should also be easy, we just iterate, and count declarations where the property starts with -- and the value is initial
What is the depth of the dependency graph? Are there any cycles? This is the trickiest of all, but also most illuminating. As a bare minimum we can easily detect cycles where a variable is referencing itself in the same declaration, but going beyond that and building a dependency graph to measure depth is tricky due to the way they cascade, so it's very complicated to figure this out from the parsed CSS. Perhaps we could walk the DOM and read the computed style, which will more reliably tell us what the dependencies are, but that is O(N) on the number of elements. 😢 I'm going to continue thinking on this one.
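
As a rough illustration of the fallback and self-reference checks discussed above, here is a minimal sketch that extracts var() calls from a declaration value. The helper names parseVarCalls and isSelfReferencing are hypothetical, not part of any existing script. It tracks parentheses instead of relying on var\(.+?\), since a lazy regexp would cut off fallbacks such as calc(...). It assumes values without parentheses or commas inside strings.

```javascript
// Hypothetical helper: find every var() call in a CSS value, respecting
// nested parentheses so that fallbacks such as calc(...) are kept whole.
function parseVarCalls(value) {
    const calls = [];
    // The lookbehind avoids matching names that merely end in "var(".
    const opener = /(?<![\w-])var\(/g;
    let match;
    while ((match = opener.exec(value)) !== null) {
        const start = match.index + match[0].length;
        let depth = 1;
        let i = start;
        // Walk forward until the parenthesis opened by var( is closed.
        while (i < value.length && depth > 0) {
            if (value[i] === "(") depth++;
            else if (value[i] === ")") depth--;
            i++;
        }
        if (depth !== 0) break; // Unbalanced value, give up.
        const args = value.slice(start, i - 1);
        const comma = args.indexOf(",");
        calls.push({
            name: (comma === -1 ? args : args.slice(0, comma)).trim(),
            fallback: comma === -1 ? null : args.slice(comma + 1).trim(),
        });
        // Resume right after "var(" so that var() calls nested inside
        // a fallback are reported as well.
        opener.lastIndex = start;
    }
    return calls;
}

// A declaration is trivially cyclic if it references its own property.
function isSelfReferencing(property, value) {
    return parseVarCalls(value).some(call => call.name === property);
}
```

Counting the calls whose fallback is not null gives the share of var() uses with a fallback, and tallying the fallback strings gives the most popular fallbacks.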

dooman87 commented 4 years ago

I'm going to start writing queries for this one if nobody has picked it up yet, @LeaVerou?

LeaVerou commented 4 years ago

@dooman87 I was actually hoping to do this one myself, but perhaps we could collaborate? E.g. I can write the JS part and you write the SQL part?

dooman87 commented 4 years ago

> @dooman87 I was actually hoping to do this one myself, but perhaps we could collaborate? E.g. I can write the JS part and you write the SQL part?

All good, I can pick up the other one or we can collaborate, but I could be a bit slow :) Let me know what you think would be the most productive.

LeaVerou commented 4 years ago

I tried to write a script today to create a dependency graph based on the computed styles of the various elements in the DOM. Unfortunately, it proved harder than expected, since:

a) There seems to be no way to tell if a variable value is inherited or not (we can compare with the parent value, but it can have false positives)
b) Most importantly, by the time we read the computed style, var() references have already been resolved and there seems to be no way to recover them.

We could traverse document.styleSheets to find the specified values, but there's no way to relate them to actual elements unless we re-implement the cascade.

One workaround/horrible hack I can think of is to traverse document.styleSheets and add a custom property for each property (e.g. --almanac-background-color) whose value includes a var(). This custom property would mirror the property value, except that var() is replaced with another name (e.g. almanac-var()) so that it doesn't get resolved in the computed style. Let me know if anyone a) can see why this wouldn't work or b) has a better idea; otherwise I'm going to implement it and see how it works.

LeaVerou commented 4 years ago

Alright, this was painful but I've made progress on this and have pushed a proof of concept. You can try it out by going to any live website, and running this in the console:

import('https://leaverou.github.io/css-almanac/runtime/var-tree.js')

I have currently set up the code to print a nicely formatted JSON object in the console, for easier debugging.

For example, this is what it generates for smashingmagazine.com. The summary property contains a summary of variable declarations picked up from stylesheets and inline styles, and the computed property contains a dependency graph based on the computed style. In most websites the data structure should be relatively compact, but it can potentially explode for websites using CSS variables on very liberal selectors like * or common type selectors. But since those are few, it should be ok?

This data structure can help us answer questions like:

Depth of dependency graph, and cycles
Which properties are CSS variables used in?
Inside what media queries and @supports queries are CSS variables used?
What kind of selectors are CSS variables used on?
How many CSS variables are used in inline styles?
How many websites use the Houdini Properties & Values API? (#3)

And many others.
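
To make the first question concrete, here is a minimal sketch of measuring depth and spotting cycles. It assumes the dependency graph has already been flattened into a plain object that maps each custom property name to the names it references through var(). That input shape and the name analyzeDependencies are hypothetical; this is not the format var-tree.js produces.

```javascript
// deps is a hypothetical, simplified input: each custom property name
// maps to an array of the names it references through var().
function analyzeDependencies(deps) {
    const depthOf = new Map();  // memoized result per property
    const visiting = new Set(); // properties on the current traversal path

    function depth(name) {
        if (depthOf.has(name)) return depthOf.get(name);
        // Meeting a property that is still being visited means a cycle.
        if (visiting.has(name)) return Infinity;
        visiting.add(name);
        let longest = 0;
        for (const ref of deps[name] ?? []) {
            longest = Math.max(longest, 1 + depth(ref));
        }
        visiting.delete(name);
        depthOf.set(name, longest);
        return longest;
    }

    let maxDepth = 0;
    const reachesCycle = [];
    for (const name of Object.keys(deps)) {
        const d = depth(name);
        if (d === Infinity) reachesCycle.push(name);
        else maxDepth = Math.max(maxDepth, d);
    }
    // maxDepth only covers properties that never run into a cycle.
    return { maxDepth, reachesCycle };
}
```

Properties that are part of a cycle, or that depend on one, end up in reachesCycle and are left out of maxDepth, since a depth is not meaningful for them.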

One issue I haven't managed to resolve is that when the Properties & Values API is used, that property appears everywhere due to having an initial value (see --number here: https://leaverou.github.io/css-almanac/runtime/var-tree.html ).

@rviscomi A few questions:

Should I precompute some of the metrics and add them as extra properties?
Short of pasting the above import statement in various websites and seeing what it returns, is there a better way to debug?
Since I'm adding new custom properties, do I need to "cleanup" after and remove them so that they don't affect other stats? Or does it not matter because everything runs separately?

rviscomi commented 4 years ago

> Should I precompute some of the metrics and add them as extra properties?

Depends on the value of the raw data. If there are many ways to slice and dice it to get different insights, keeping the raw JSON and analyzing it in SQL SGTM. Otherwise precomputing metrics is a good idea.

> Short of pasting the above import statement in various websites and seeing what it returns, is there a better way to debug?

You should inline the var-tree.js contents into the custom metric script itself. This will avoid any CORS/CSP issues when we run it in the HTTP Archive crawl.

> Since I'm adding new custom properties, do I need to "cleanup" after and remove them so that they don't affect other stats? Or does it not matter because everything runs separately?

That's a great question, I'm not sure. Could you make a clone and modify that as needed, rather than the global/shared object?

LeaVerou commented 4 years ago

> Depends on the value of the raw data. If there are many ways to slice and dice it to get different insights, keeping the raw JSON and analyzing it in SQL SGTM. Otherwise precomputing metrics is a good idea.

Oh the JSON definitely is high value and should be kept, I was just asking if I should precompute some metrics in addition to it.

> You should inline the var-tree.js contents into the custom metric script itself. This will avoid any CORS/CSP issues when we run it in the HTTP Archive crawl.

Will do! How urgently do I need to do that? I would like to iterate a bit more if possible, in case I can fix that annoying Properties & Values bug.

> Could you make a clone and modify that as needed, rather than the global/shared object?

I'm afraid not, the way this works is it adds custom properties that mirror properties with var() references, after replacing var() with a different name, so we can retrieve it pre-substitution. Then it finds these elements, and reads their computed style. Not sure how to do that in a clone efficiently. Should I add cleanup code or do you want to find out if it's needed first?

rviscomi commented 4 years ago

> Will do! How urgently do I need to do that? I would like to iterate a bit more if possible, in case I can fix that annoying Properties & Values bug

ASAP. Custom metrics were due yesterday and PRs are being reviewed now to make it into the August crawl.

> Should I add cleanup code or do you want to find out if it's needed first?

I'm not very familiar with the possible effects of adding new custom properties on other metrics. If the CSS chapter metrics are the only ones looking at custom properties, I think it should be safe?

LeaVerou commented 4 years ago

> ASAP. Custom metrics were due yesterday and PRs are being reviewed now to make it into the August crawl.

Ok, I'm staying up tonight to get both this and the Sass one in, as I've made lots of good progress on that too. Hopefully submitting the PRs around noon EST should still be ok?

> I'm not very familiar with the possible effects of adding new custom properties on other metrics. If the CSS chapter metrics are the only ones looking at custom properties, I think it should be safe?

Cool. If there were another metric looking at runtime custom props I'd be aware of it, so let's leave the cleanup out for now.

rviscomi commented 4 years ago

SGTM thanks for accelerating this. We need to get it merged by EOD today.

LeaVerou commented 4 years ago

Just saw this, on it.

rviscomi commented 3 years ago

Can we mark this one as Has JS?

LeaVerou commented 3 years ago

Done! Finally all metrics have JS! 🎉 @rviscomi could you update the "Has SQL" column? I think you've written queries for a few things that I see in the "Needs SQL" column

rviscomi commented 3 years ago

Some topics like this one have many queries, which I've only partially implemented so far. I'll be sure to mark anything that is totally complete as Has SQL. The checklist in https://github.com/HTTPArchive/almanac.httparchive.org/pull/1332 is a good source of truth for the metrics that have been implemented already.

rviscomi commented 3 years ago

I think I've implemented everything that has JS, but TBH I'm not sure if that covers all the questions you had about custom properties. @LeaVerou can you check the results sheet and let me know if there's anything missing and where I could find its corresponding JS?

LeaVerou commented 3 years ago

@rviscomi Did you notice js/01-var.js? There is a component to this that is AST-based.

rviscomi commented 3 years ago

Ahh I missed some of the other props in there. Could you help me understand how you intended they be aggregated (supports, pseudo-classes, fallback, initial)?

For example, I can sum up the value of initial, but not sure what to divide it by. Is there a "total number of custom properties" value, ie maybe the sum of all properties object values?

rviscomi commented 3 years ago

I'm also unsure what you're looking to measure for these two metrics under the "Custom Properties" section:

Constants
Custom properties in JS

Could you point me to any other detailed descriptions and/or JS?

LeaVerou commented 3 years ago

Constants is covered by popular values I suppose.

Custom properties in JS is about whether people set CSS variables from JS. Along with things like pseudo-classes, it is another way to see whether people take advantage of the reactivity or just use them like preprocessor variables. I don't see it being measured anywhere right now, and I'm not sure how to measure it without access to the JS.

If we can query the JS, I suppose we could look for things like .setProperty(["']--.
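
A minimal sketch of such a search, assuming the JS source is available as a string. The function name findCustomPropertiesSetInJS is hypothetical. It only catches literal names passed to setProperty, so dynamic names, template literals and helper libraries would be missed.

```javascript
// Count custom property names passed as string literals to
// .setProperty() in a piece of JS source code.
function findCustomPropertiesSetInJS(source) {
    const pattern = /\.setProperty\(\s*["'](--[\w-]+)["']/g;
    const counts = new Map();
    for (const match of source.matchAll(pattern)) {
        const name = match[1];
        counts.set(name, (counts.get(name) ?? 0) + 1);
    }
    return counts;
}
```

The resulting map of names to counts could then be compared against the property names found in CSS.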

rviscomi commented 3 years ago

Ok great, so we can check off "constants" with the Top Custom Property Values results.

Custom properties in JS doesn't have a custom metric to parse the JS so unfortunately this one isn't feasible at this stage.

I'll update the analysis PR accordingly.

LeaVerou commented 3 years ago

@rviscomi Another way to figure this out is if there are custom properties in the computed style that are not found in the stylesheets. However, this approach has a few drawbacks:

False positives: The summary in the custom metric only covers stylesheets that are not cross-origin, so any cross-origin stylesheets defining custom properties would show up as if they were set via JS. Not sure if we can query the AST and the custom metric together…
False negatives: If a custom property is set conditionally via JS and is not set at the time the custom metric ran, it won't be counted.
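
In its simplest form, the comparison described above is a set difference. The sketch below assumes two lists of custom property names are already available, one read from computed styles and one collected from stylesheets; both inputs and the name possiblySetViaJS are hypothetical, and the result is subject to the false positives and false negatives just listed.

```javascript
// Names seen in computed styles but never declared in any inspected
// stylesheet are candidates for having been set via JS.
function possiblySetViaJS(computedNames, stylesheetNames) {
    const declared = new Set(stylesheetNames);
    return [...new Set(computedNames)].filter(name => !declared.has(name));
}
```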

Thoughts?

rviscomi commented 3 years ago

Given the drawbacks, I'd punt on this metric.

LeaVerou / css-almanac

Going deeper on custom-property usage #1