openeduhub / metalookup

Provide metadata about domains w.r.t. accessibility, licensing, ads, etc.
GNU General Public License v3.0

Replace Splash with Playwright #156

Closed MRuecklCC closed 2 years ago

MRuecklCC commented 2 years ago

To improve response times and get more completely rendered HTML, it would make sense to replace Splash with Playwright.

The best-case scenario would be a one-to-one replacement, where only the Docker container and the communication with Splash get replaced, and all that remains is renaming `SplashResponse` to `PlaywrightResponse`.
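To make the one-to-one idea concrete, here is a rough sketch of what the Playwright side could look like. The fields of `PlaywrightResponse` and the helper `fetch_rendered` are assumptions for illustration only; they are not taken from the existing `SplashResponse` or from the metalookup code base.

```python
from dataclasses import dataclass, field


@dataclass
class PlaywrightResponse:
    """Hypothetical stand-in for SplashResponse; all field names are assumptions."""

    url: str
    status: int
    html: str
    headers: dict = field(default_factory=dict)


def fetch_rendered(url: str, timeout_ms: int = 30_000) -> PlaywrightResponse:
    """Render a page in headless Chromium and return its final HTML (hypothetical helper)."""
    # Imported lazily so the dataclass stays usable without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            page = browser.new_page()
            # Wait until the network is idle so client-side rendering can finish.
            response = page.goto(url, timeout=timeout_ms, wait_until="networkidle")
            return PlaywrightResponse(
                url=page.url,
                # goto() may return None, e.g. for same-document navigations.
                status=response.status if response else 0,
                html=page.content(),
                headers=dict(response.headers) if response else {},
            )
        finally:
            browser.close()
```

Calling `fetch_rendered("https://example.org")` would need the `playwright` package and a browser installed via `playwright install chromium`; a real replacement would probably keep a browser alive across requests instead of launching one per call.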

MRuecklCC commented 2 years ago

See: https://issues.edu-sharing.net/jira/browse/KBMBF-539

MRuecklCC commented 2 years ago

The Playwright Python package does not allow exporting a HAR file directly. However, there is a third-party Python package (https://github.com/ninoseki/playwright-har-tracer) which uses the Playwright Python API to register for the relevant events (network traffic etc.) and manually populates a HAR document on the Python side.

The package looks OK, but it appears to be largely unmaintained and may not be 100% trustworthy :-/
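For reference, the event-based approach described above can be sketched in a few lines without the third-party package. This is a minimal illustration, not how playwright-har-tracer is implemented: `HarRecorder` and `record` are hypothetical names, timings and bodies are omitted (so the output is HAR-like rather than a complete HAR 1.2 document), and `startedDateTime` is only the time at which the response event was observed.

```python
import json
from datetime import datetime, timezone


def _header_list(headers: dict) -> list:
    """Convert a header dict into the name/value list layout used by HAR."""
    return [{"name": name, "value": value} for name, value in headers.items()]


class HarRecorder:
    """Hypothetical collector that builds a HAR-like dict from response events."""

    def __init__(self):
        self.entries = []

    def on_response(self, response):
        # Works with any object exposing Playwright's Response attributes.
        request = response.request
        self.entries.append({
            # Assumption: observation time, not the true request start time.
            "startedDateTime": datetime.now(timezone.utc).isoformat(),
            "request": {
                "method": request.method,
                "url": request.url,
                "headers": _header_list(request.headers),
            },
            "response": {
                "status": response.status,
                "statusText": response.status_text,
                "headers": _header_list(response.headers),
            },
        })

    def to_har(self) -> dict:
        return {
            "log": {
                "version": "1.2",
                "creator": {"name": "har-sketch", "version": "0.1"},
                "entries": self.entries,
            }
        }


def record(url: str) -> dict:
    """Load a page and return the HAR-like document for its network traffic."""
    # Imported lazily so HarRecorder can be exercised without Playwright installed.
    from playwright.sync_api import sync_playwright

    recorder = HarRecorder()
    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            page = browser.new_page()
            page.on("response", recorder.on_response)
            page.goto(url)
        finally:
            browser.close()
    return recorder.to_har()
```

`json.dumps(record("https://example.org"), indent=2)` would then give a serialisable document. Playwright's `browser.new_context` also accepts a `record_har_path` argument that writes a HAR file when the context is closed; whether that fits the use case here (a file on disk rather than an in-memory document) would need checking.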