<<THIS REPOSITORY IS DEPRECATED>> The HTTP Archive provides information about website performance such as # of HTTP requests, use of gzip, and amount of JavaScript. This information is recorded over time revealing trends in how the Internet is performing. Built using Open Source software, the code and data are available to everyone allowing researchers large and small to work from a common base.
The custom properties added are removed at the end. This means that the code is now idempotent; it can run multiple times on the same website with the same results. It also means that if any other metrics traverse document.styleSheets they will not be affected if this runs first.
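The cleanup described above can be sketched roughly as follows. This is only an illustration, not the code from this PR: the marker name, the `FakeStyle` stand-in, and the `countMarked` function are all hypothetical. In a browser the objects would be real `CSSStyleDeclaration` instances, which expose the same `setProperty` / `getPropertyValue` / `removeProperty` methods.

```javascript
// Hypothetical marker name, used only for this sketch.
const MARKER = "--sketch-marker";

// Minimal stand-in for CSSStyleDeclaration so the sketch runs outside a browser.
class FakeStyle {
  constructor() {
    this.props = new Map();
  }
  setProperty(name, value) {
    this.props.set(name, value);
  }
  getPropertyValue(name) {
    return this.props.get(name) ?? "";
  }
  removeProperty(name) {
    const old = this.getPropertyValue(name);
    this.props.delete(name);
    return old;
  }
}

// Hypothetical analysis step: tag every style, read the tags, then undo the tagging.
function countMarked(styles) {
  const touched = [];
  try {
    for (const style of styles) {
      style.setProperty(MARKER, "1"); // temporary bookkeeping
      touched.push(style);
    }
    return styles.filter((s) => s.getPropertyValue(MARKER) === "1").length;
  } finally {
    // Undo every change so a second run, or another metric, sees the original state.
    for (const style of touched) {
      style.removeProperty(MARKER);
    }
  }
}

const styles = [new FakeStyle(), new FakeStyle()];
const firstRun = countMarked(styles);
const secondRun = countMarked(styles);
console.log(firstRun === secondRun);
```

Because the removal sits in a `finally` block, the temporary properties are cleaned up even if the analysis throws, which is one way to keep repeated runs consistent.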
Duplicate siblings in the same obj.children array are collapsed into one object, with a times property indicating how many there were. Non-contiguous siblings are also collapsed, so this is a bit lossy, but it does not affect any of the things we may want to study in this data structure. This change reduced the size of the tree generated on my own website by 40%, though the savings were less impressive on other websites (basically the more generic the selectors that use custom properties, the bigger the savings).
The algorithm does recursively call JSON.stringify() on every single object in the graph, so there's a certain tradeoff of memory vs computation, but in practice it seems to run pretty fast, even on large trees.
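A minimal sketch of this kind of collapsing, assuming each node is a plain JSON-serializable object with an optional `children` array. The function name `collapseDuplicates` is hypothetical, and adding `times` only to nodes that occur more than once is an assumption of this sketch. The PR's actual implementation may differ.

```javascript
// Collapse structurally identical siblings in every `children` array, in place.
// Two siblings count as duplicates when JSON.stringify() gives the same string,
// so property order matters for the comparison.
function collapseDuplicates(node) {
  if (!node || !Array.isArray(node.children)) {
    return node;
  }

  // Serialized child -> first sibling with that serialization.
  // A Map keeps insertion order, so first occurrences keep their relative order.
  const seen = new Map();

  for (const child of node.children) {
    // Collapse descendants first so that equal subtrees serialize equally.
    collapseDuplicates(child);

    const key = JSON.stringify(child);
    const first = seen.get(key);
    if (first) {
      // Assumption: `times` is only present on nodes seen more than once.
      first.times = (first.times || 1) + 1;
    } else {
      seen.set(key, child);
    }
  }

  node.children = [...seen.values()];
  return node;
}

const tree = {
  name: "root",
  children: [
    { name: "a", children: [{ name: "x" }, { name: "x" }] },
    { name: "b" },
    { name: "a", children: [{ name: "x" }, { name: "x" }] },
  ],
};

collapseDuplicates(tree);
console.log(JSON.stringify(tree, null, 2));
```

In this example the two non-contiguous `a` nodes end up as a single node with `times: 2`, which shows the lossy part: the original position of the second `a` is not recorded. Each level re-serializes its subtrees, which is the computation cost mentioned above.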
This PR does two things: it removes the custom properties it adds once it is done, and it collapses duplicate siblings in each obj.children array.
Progress on https://github.com/HTTPArchive/almanac.httparchive.org/issues/898