SiteOne Crawler is a cross-platform website crawler and analyzer for SEO, security, accessibility, and performance optimization—ideal for developers, DevOps, QA engineers, and consultants. Supports Windows, macOS, and Linux (x64 and arm64).
GOAL: allow the crawler to handle complex authentication scenarios
Options
a) In-tool: build a dedicated panel that lets the user manually configure, or import, a set of request headers so the crawler can reuse a session that has already been authenticated in a browser.
b) External tool integration: build an integration with Postman or similar tools (e.g. a Burp Suite extension).
c) Selenium script runner: this would require four parts: 1) running Chromium in headless mode with a user-defined Selenium script generated by one of the publicly available browser extensions (e.g. Qualys Browser Recorder, Selenium Recorder); 2) capturing the session cookies and passing them to the crawler; 3) a way for the crawler to intercept logout/logoff triggers; 4) an internal routine that lets the crawler repeat the authentication process whenever a logout/logoff trigger fires and invalidates the current session cookies/identifiers.
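
To make the proposal more concrete, here is a rough Python sketch of two of the ideas: importing headers copied from a browser (option a) and re-authenticating when the session appears to have been invalidated (parts 3 and 4 of option c). It is illustrative only and not part of SiteOne Crawler. All names (`parse_raw_headers`, `looks_logged_out`, `AuthenticatedFetcher`, the `/login` path) are hypothetical, and the `fetch` and `authenticate` callables stand in for the real HTTP client and for whatever would run the Selenium script.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


def parse_raw_headers(raw: str) -> Dict[str, str]:
    """Parse 'Name: value' lines (e.g. copied from browser dev tools) into a dict."""
    headers: Dict[str, str] = {}
    for line in raw.splitlines():
        name, sep, value = line.strip().partition(":")
        if not sep or not name.strip():
            continue  # skip blank lines and anything that is not 'Name: value'
        headers[name.strip()] = value.strip()
    return headers


@dataclass
class Response:
    status: int
    url: str  # final URL after redirects


def looks_logged_out(resp: Response, login_path: str = "/login") -> bool:
    """Assumed heuristic: a 401, or a redirect that ended on the login page."""
    return resp.status == 401 or login_path in resp.url


@dataclass
class AuthenticatedFetcher:
    # Hypothetical transport: (url, cookies) -> Response.
    fetch: Callable[[str, Dict[str, str]], Response]
    # Hypothetical login routine, e.g. one that replays a recorded
    # Selenium script and returns the resulting session cookies.
    authenticate: Callable[[], Dict[str, str]]
    max_reauth: int = 1
    cookies: Dict[str, str] = field(default_factory=dict)

    def get(self, url: str) -> Response:
        if not self.cookies:
            self.cookies = self.authenticate()
        resp = self.fetch(url, self.cookies)
        attempts = 0
        # Repeat the login a bounded number of times if the session was lost.
        while looks_logged_out(resp) and attempts < self.max_reauth:
            self.cookies = self.authenticate()
            resp = self.fetch(url, self.cookies)
            attempts += 1
        return resp
```

Bounding the retries with `max_reauth` is a deliberate choice in this sketch: if the credentials themselves are wrong, the crawler should give up rather than loop on the login page. A real implementation would also likely need to exclude logout URLs from the crawl queue so the crawler does not end its own session.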