phimage / Erik

Erik is a headless browser based on WebKit. A headless browser lets you run functional tests and access and manipulate web pages using JavaScript.
http://phimage.github.io/Erik/
MIT License
595 stars, 47 forks

Disable CSS and images #48

noamalffasy closed 5 years ago

noamalffasy commented 5 years ago

Is it possible to not download CSS and images to speed up the crawling?

phimage commented 5 years ago

But do you need JavaScript and its execution?

If not, you do not need this framework: a plain URLRequest (or a networking library such as Alamofire) and a parser like tid-kijyun/Kanna (which I use here) are enough.

If yes, I do not see any configuration in WebKit to deactivate that. There is stopLoading, but it will stop the JavaScript download too. There is also a delegate on WKWebView; maybe you can decide inside it what to load or not: https://developer.apple.com/documentation/webkit/wknavigationdelegate

noamalffasy commented 5 years ago

I actually wanted to disable only CSS and images, because the app would download fewer files and load the info faster. I know that it's possible in other browsers (using Puppeteer, for example). If it isn't supported natively, is it possible to block URLs? That way I'd be able to block URLs that end with .png or any other file extension that indicates an image or CSS.
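
To make the extension-based idea concrete, here is a minimal sketch of such a check in plain Swift. The `blockedExtensions` list and the `shouldBlock` helper are hypothetical names for illustration, not part of Erik or WebKit, and where a check like this could be hooked in is a separate question.

```swift
import Foundation

// Hypothetical list of extensions treated as images or stylesheets.
let blockedExtensions: Set<String> = [
    "png", "jpg", "jpeg", "gif", "webp", "svg", "ico", "css"
]

/// Returns true when the URL's path ends in one of the blocked extensions.
/// Illustrative helper only; not an Erik or WebKit API.
func shouldBlock(_ url: URL) -> Bool {
    // pathExtension looks at the path only, so a query string
    // such as "?v=2" does not hide the extension.
    return blockedExtensions.contains(url.pathExtension.lowercased())
}
```

Note that a check like this misses images or stylesheets served from URLs without a telltale extension.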

phimage commented 5 years ago

As I said, Erik uses WebKit, Apple's framework. See https://www.hackingwithswift.com/example-code/wkwebview/how-to-control-the-sites-a-wkwebview-can-visit-using-wknavigationdelegate, but I do not remember whether resources like images and CSS trigger these delegate methods.
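
For later readers: the navigation delegate's policy callbacks are invoked for navigations (page and frame loads), not for subresources such as images and stylesheets. One WebKit mechanism that does act on subresources is a content rule list (`WKContentRuleList`, available from iOS 11 and macOS 10.13). The following is a minimal sketch under that assumption; `makeFilteredWebView` and the identifier string are hypothetical names, and whether and how Erik can be given a preconfigured `WKWebView` is not covered in this thread, so that wiring is left out.

```swift
import Foundation
#if canImport(WebKit)
import WebKit
#endif

// Rules in WebKit's content-blocker JSON format:
// block every request whose resource type is an image or a stylesheet.
let blockRulesJSON = """
[
  {
    "trigger": { "url-filter": ".*", "resource-type": ["image", "style-sheet"] },
    "action": { "type": "block" }
  }
]
"""

#if canImport(WebKit)
/// Compiles the rules and passes back a WKWebView configured with them,
/// or nil if compilation fails. Hypothetical helper for illustration.
@available(iOS 11.0, macOS 10.13, *)
@MainActor
func makeFilteredWebView(completion: @escaping (WKWebView?) -> Void) {
    WKContentRuleListStore.default().compileContentRuleList(
        forIdentifier: "block-css-and-images", // arbitrary identifier
        encodedContentRuleList: blockRulesJSON
    ) { ruleList, error in
        // Hop to the main queue explicitly before touching WKWebView.
        DispatchQueue.main.async {
            guard let ruleList = ruleList else {
                print("Rule compilation failed: \(String(describing: error))")
                completion(nil)
                return
            }
            let configuration = WKWebViewConfiguration()
            configuration.userContentController.add(ruleList)
            completion(WKWebView(frame: .zero, configuration: configuration))
        }
    }
}
#endif
```

JavaScript is not listed in `resource-type`, so scripts still load and run; only image and stylesheet requests are blocked.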