catchpoint / WebPageTest.agent

Cross-platform WebPageTest agent

Update Wappalyzer and Chrome feature mappings #606

Closed pmeenan closed 1 year ago

pmeenan commented 1 year ago

Monthly Wappalyzer update (just definitions, no engine changes) and updated the list of Chrome feature numbers.

NOTE: This looks like it includes a license change to GPLv3 (which is actually a fix for a mis-licensing issue), so it's worth double-checking that there are no concerns.

pmeenan commented 1 year ago

Might be worth holding off on this merge since there is no other GPLv3 code in the agent. It might also be worth considering whether the "use" of Wappalyzer should be re-architected so that it is called externally, as needed, on systems where people choose to install it. That way it is not part of the actual agent code and doesn't cause viral license problems.

Regular use is likely not a problem since the agent code is never "distributed" (which is why WPT switched to Polyform for competing-use protection), but it is probably worth separating Wappalyzer out just to be sure that the agent can continue to use a Polyform license cleanly.

tkadlec commented 1 year ago

@pmeenan Been thinking about this for other reasons. We have some things like this (Axe, Wappalyzer, CrUX) that happen after the test and are kind of separate. They probably don't need to be in the core agent, and maybe just shouldn't be (and we've got a couple more planned).

Need to figure out what that approach would look like. :/

pmeenan commented 1 year ago

Axe and Wappalyzer (and anything JS-based) that run at the end of the test could probably be split out as custom metrics (even if it's a "special" kind of custom metric that the agent manages).

The agent already supports a local directory of custom metrics that we use for the HTTP Archive so that the metric code doesn't have to be sent with the job on every test. It would be fairly trivial to add support for a similar directory of metrics that are pulled from separate repos but that the agent manages (or just use the existing mechanism as part of the install script).
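As a rough illustration of the directory idea, here is a minimal sketch and not the agent's actual code. It assumes each metric is a `.js` file whose file name is the metric name; the function name `load_custom_metrics` and the directory layout are hypothetical.

```python
from pathlib import Path


def load_custom_metrics(directories):
    """Collect metric scripts from several directories (hypothetical helper).

    Returns a dict mapping metric name (file stem) to its JavaScript source.
    Directories are read in order, so a later directory overrides an earlier
    one when both contain a metric with the same name.
    """
    metrics = {}
    for directory in directories:
        root = Path(directory)
        if not root.is_dir():
            # A directory that was never installed is simply skipped.
            continue
        for path in sorted(root.glob("*.js")):
            metrics[path.stem] = path.read_text(encoding="utf-8")
    return metrics
```

With something like this, the install script could clone separately licensed metric repos into their own directory and pass that path alongside the existing local one, so the third-party code never has to live in the agent repo.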

CrUX is a bit more complicated since it runs Python code locally as part of the fetch/processing. I'm not sure a modular approach is needed unless we start to have several cases of that, but it could potentially be built as support for Python modules in a directory with a well-known interface. At the end of a test, the agent could just call into each of them with a "self" reference so the code could post-process whatever test data it needed without having to know beforehand what it would need.
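The module-directory idea could look something like the sketch below. Everything here is an assumption for illustration: the "well-known interface" is taken to be a module-level `process(agent)` function, and `load_post_processors` / `run_post_processors` are hypothetical names, not existing agent functions.

```python
import importlib.util
from pathlib import Path


def load_post_processors(directory):
    """Import every *.py file in `directory` that defines a callable `process`."""
    modules = []
    for path in sorted(Path(directory).glob("*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, str(path))
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        # Files without the assumed entry point are ignored.
        if callable(getattr(module, "process", None)):
            modules.append(module)
    return modules


def run_post_processors(agent, modules):
    """Call process(agent) on each module and collect failures by module name.

    A failing module is recorded and skipped so it cannot break the test run.
    """
    errors = {}
    for module in modules:
        try:
            module.process(agent)
        except Exception as err:
            errors[module.__name__] = str(err)
    return errors
```

Passing the agent object itself (the "self" reference) keeps the interface to one function: each module reads or writes whatever test data it needs, and the agent does not have to know in advance what that is.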