Open simonw opened 7 months ago
One thing this can be useful for is taking screenshots of pages that detect and block headless Chrome. They often seem to do that by looking for navigator.webdriver.
https://www.news.com.au/ is an example:
shot-scraper https://www.news.com.au/ -h 600
But using the prototype from https://github.com/simonw/shot-scraper/commit/fae9babee52fc109c643501dd74cb9f75d18d19b and a tip from https://stackoverflow.com/a/75771301/6083
shot-scraper https://www.news.com.au/ -h 600 \
--init-script 'delete Object.getPrototypeOf(navigator).webdriver' \
--user-agent 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:124.0) Gecko/20100101 Firefox/124.0'
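For anyone who wants to try the same trick directly in Playwright, the command above corresponds roughly to the calls below. This is a minimal sketch, not shot-scraper's actual implementation: the function name `screenshot_with_init_script`, the 1280px viewport width and the output filename are assumptions made for illustration.

```python
INIT_SCRIPT = "delete Object.getPrototypeOf(navigator).webdriver"
USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:124.0) "
    "Gecko/20100101 Firefox/124.0"
)


def screenshot_with_init_script(url, output, init_script=INIT_SCRIPT,
                                user_agent=USER_AGENT, height=600):
    """Hypothetical helper: screenshot `url` after registering an init script."""
    # Imported lazily so this module can be loaded without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        context = browser.new_context(
            user_agent=user_agent,
            # The width is an assumed value; only the height comes from -h 600.
            viewport={"width": 1280, "height": height},
        )
        # Registered on the context, so it applies to every page it opens.
        context.add_init_script(init_script)
        page = context.new_page()
        page.goto(url)
        page.screenshot(path=output)
        browser.close()


# Example call (needs Playwright and a Chromium build installed):
# screenshot_with_init_script("https://www.news.com.au/", "news.png")
```

Registering the script on the browser context rather than on a single page means it would also run in any additional pages opened from that context.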
Asked ChatGPT for more ideas of things to do with init scripts: https://chat.openai.com/share/71c5302f-bb92-4bd8-8eb3-311d855311b0
A few that I really liked:
browser_context.add_init_script("""
Date.now = function() { return new Date('2024-01-01T00:00:00Z').getTime(); };
""")
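browser_context.add_init_script("""
const originalFetch = window.fetch;
window.fetch = async function(...args) {
  // fetch() also accepts Request and URL objects, so normalise to a string first
  const url = args[0] instanceof Request ? args[0].url : String(args[0]);
  if (url.includes('api.example.com')) {
    // Short-circuit matching requests with a canned JSON response
    return new Response(JSON.stringify({ mocked: true }), { status: 200 });
  }
  return originalFetch(...args);
};
""")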
browser_context.add_init_script("""
localStorage.setItem('key', 'value');
document.cookie = 'name=value; path=/';
""")
Claude 3 Opus suggested "Simulate a specific device":
page.add_init_script("""
Object.defineProperty(window, 'innerWidth', {
  writable: true,
  configurable: true,
  value: 375,
});
Object.defineProperty(window, 'innerHeight', {
  writable: true,
  configurable: true,
  value: 812,
});
""")
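Init scripts are JavaScript that Playwright evaluates in each new document, after the document is created but before any of the page's own scripts run, which makes them useful for priming the page: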
https://playwright.dev/python/docs/api/class-page#page-add-init-script
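One way to check that ordering for yourself is to load a tiny page whose inline script records what it can see at the moment it runs. The sketch below is only an illustration: the names `build_data_url`, `what_page_script_saw`, `__primed` and `__seen` are made up for this example, and it assumes init scripts also apply to `data:` URLs.

```python
from urllib.parse import quote

INIT_SCRIPT = "window.__primed = 'set by init script';"
# Inline page script: stores whatever value it observed when it executed.
PAGE_HTML = "<script>window.__seen = window.__primed || 'nothing';</script>"


def build_data_url(html):
    """Wrap an HTML string in a percent-encoded data: URL."""
    return "data:text/html," + quote(html)


def what_page_script_saw():
    """Return the value the page's own inline script observed."""
    # Imported lazily so this module can be loaded without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.add_init_script(INIT_SCRIPT)
        page.goto(build_data_url(PAGE_HTML))
        seen = page.evaluate("window.__seen")
        browser.close()
    return seen
```

Calling `what_page_script_saw()` with and without the `add_init_script` line shows whether the page script ran before or after the init script.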
This should be an option for shot and javascript and more.