Open simonw opened 2 years ago
Just to add an additional example — Playwright does a version of this behind the scenes itself. It "injects" helper scripts into every page.
Plugins would be great, especially if they make it possible to detect whether an image already exists, or to skip saving a page when a 404 or other error status code is detected.
Tweet: https://twitter.com/simonw/status/1514657436287705119
I want to be able to use tricks like this one - where Readability.js is injected into a page - without relying on CDNs: https://til.simonwillison.net/shot-scraper/readability
One option would be to package things like this up as plugins using Pluggy (as seen in Datasette) - then serve the JavaScript assets using a `/-/shot-scraper-xyz/plugins/...` route configured using https://playwright.dev/python/docs/api/class-page#page-route
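
A rough sketch of how the route idea might look, under some assumptions: the URL layout `/-/shot-scraper-xyz/plugins/<plugin>/<file>`, the `resolve_asset` and `make_handler` helpers, and the `plugin_dirs` mapping are all hypothetical and not part of shot-scraper. The only Playwright calls relied on are `page.route()`, `route.request.url` and `route.fulfill()`. Plugin discovery through Pluggy is left out here; the mapping of plugin names to asset directories is simply passed in.

```python
from pathlib import Path
from typing import Callable, Dict, Optional
from urllib.parse import urlparse

# Hypothetical URL prefix, modeled on the route suggested above.
PLUGIN_PREFIX = "/-/shot-scraper-xyz/plugins/"


def resolve_asset(url: str, plugin_dirs: Dict[str, Path]) -> Optional[Path]:
    """Map <prefix><plugin>/<file> to a file inside that plugin's directory.

    Returns None if the URL is outside the prefix, the plugin is unknown,
    the file does not exist, or the path tries to escape the directory.
    """
    path = urlparse(url).path
    if not path.startswith(PLUGIN_PREFIX):
        return None
    plugin, _, relative = path[len(PLUGIN_PREFIX):].partition("/")
    root = plugin_dirs.get(plugin)
    if root is None or not relative:
        return None
    root = root.resolve()
    candidate = (root / relative).resolve()
    try:
        # Raises ValueError when candidate is not inside root (e.g. "../").
        candidate.relative_to(root)
    except ValueError:
        return None
    return candidate if candidate.is_file() else None


def make_handler(plugin_dirs: Dict[str, Path]) -> Callable:
    """Build a handler suitable for page.route(pattern, handler)."""

    def handle(route) -> None:
        asset = resolve_asset(route.request.url, plugin_dirs)
        if asset is None:
            route.fulfill(status=404, body="Not found")
        else:
            # Serve the local file instead of letting the request hit the network.
            route.fulfill(path=str(asset), content_type="application/javascript")

    return handle


# Assumed usage with a Playwright page (plugin name and directory are made up):
# page.route(
#     "**/-/shot-scraper-xyz/plugins/**",
#     make_handler({"readability": Path("plugins/readability/static")}),
# )
```

With something like this in place, an injected script tag could point at a URL under that prefix on whatever origin the page has, and the browser would receive the locally packaged copy rather than one fetched from a CDN.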