GuilloOme closed 7 years ago
After investigating, it seems that PhantomJS is not happy with playing on the localhost network… I filed bug #14808 with them.
It seems to be a behavior of Qt (the library used by PhantomJS); unfortunately, it does not appear to be "fixable"… (see this response)
When launching a crawl against a localhost target, it seems that only the start URL and robots.txt are requested through the proxy (during the validation process); the requests made afterwards do not go through it.
Way to reproduce:
$ ./htcap.py crawl -v -p http:127.0.0.1:8080 http://localhost/index.html test.db
You get: Crawl finished, 3 pages analyzed in 0 minutes
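For anyone hitting the same thing, here is a minimal sketch of a check that could be run on the target URL before crawling, to warn that the proxy may be skipped. It assumes the bypass applies whenever the target host is a loopback address; that is my reading of the behavior described above, not something taken from the Qt or PhantomJS sources. The function name `is_loopback_target` is hypothetical and is not part of htcap.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_target(url):
    """Return True if the URL's host looks like a loopback address.

    Assumption: requests to such hosts are the ones that may skip the proxy.
    This is an illustrative helper, not htcap or PhantomJS code.
    """
    host = urlsplit(url).hostname  # lowercased host, or None if absent
    if host is None:
        return False
    # The name "localhost" (and its subdomains) conventionally maps to loopback.
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        # Covers literal addresses such as 127.0.0.1 and ::1.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal; a regular hostname is treated as non-loopback here.
        return False
```

With the command above, `is_loopback_target("http://localhost/index.html")` would flag the target, so a warning could be printed before the crawl starts. The check does not resolve DNS, so a custom hostname that points to 127.0.0.1 would not be detected.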