sensepost / gowitness

🔍 gowitness - a golang, web screenshot utility using Chrome Headless
GNU General Public License v3.0

Chromedp Timed Context #129

Closed: randomactsofsecurity closed this 2 years ago

randomactsofsecurity commented 2 years ago

This PR helps address the problem raised in https://github.com/sensepost/gowitness/issues/116, where gowitness was failing to complete its tasks. If you monitor long-running jobs, you'll notice Chrome instances whose idle time exceeds every configured timeout. No timeout was set on the chromedp process, so I've added one.

Additionally, it follows a pattern where we start the Chrome browser, then open a new context as a "tab". We attempt navigation in this first context with a specific timeout set, so that the context closes gracefully. If navigation and all resource loading complete within the timeout, we take a screenshot.

However, if we exceed the timeout, we don't want to return an error, because the page may well have loaded and we are only waiting on a few stray resources that never finish loading. Instead, we create a second context with its own timeout and attempt to capture the screenshot as is.

There is a chance this introduces more blank white screenshots when a page genuinely fails to load, but I don't have a good way of signaling that as an error. I think this is an acceptable trade-off, and with perception hashing they all get filtered out anyway.

Lastly, I added a couple of browser crash fixes using the inspector library. Another reason the process would get 'stuck open' was that a browser crash occurred and chromedp did not handle it gracefully, so monitoring each context for a crash helps resolve this issue. A good example right now is www.dupont.com: put it in a file with some other URLs and you should observe a crash if you monitor the chromedp log.

ghost commented 2 years ago

Conflicts with master have been resolved in #132.

leonjza commented 2 years ago

Implemented in #132. Thanks so much, @randomactsofsecurity & @rtpt-jonaslieb!