Open nwohaibi opened 9 years ago
Hi @nwohaibi,
Thanks!
I just wanted to know if there is any way to disable browser caching of files?
There is a way to do it in QWebKit (see http://doc.qt.io/qt-4.8/qnetworkrequest.html#CacheLoadControl-enum), but currently this option is not exposed by Splash. It is a good feature to have, but we need to design a public API for it and implement it.
Or maybe return all HTTP requests made in har/log/entries, not just the ones with 200 http status ?
HAR entries already contain all HTTP requests, not just the ones with a 200 HTTP status code. When resources are served from the cache, some records may be missing because those resources are not requested at all. It should be possible to add them to the output as well, but I haven't checked the details; the implementation may not be so straightforward.
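To check which requests actually made it into the HAR output, a small script along these lines might help. It is only a sketch: it assumes the standard HAR layout (`log.entries[].request.url` and `log.entries[].response.status`), and the helper names and the sample data are hypothetical, not part of Splash.

```python
from collections import Counter


def status_counts(har):
    """Count HAR entries by HTTP response status code."""
    entries = har.get("log", {}).get("entries", [])
    return Counter(entry["response"]["status"] for entry in entries)


def missing_urls(har, expected_urls):
    """Return the expected URLs that have no entry in the HAR log, in order."""
    entries = har.get("log", {}).get("entries", [])
    seen = {entry["request"]["url"] for entry in entries}
    return [url for url in expected_urls if url not in seen]


# Hypothetical HAR data, made up for illustration only.
sample_har = {
    "log": {
        "entries": [
            {"request": {"url": "http://example.com/"},
             "response": {"status": 200}},
            {"request": {"url": "http://example.com/old"},
             "response": {"status": 302}},
            {"request": {"url": "http://example.com/nope.png"},
             "response": {"status": 404}},
        ]
    }
}

print(status_counts(sample_har))
print(missing_urls(sample_har, ["http://example.com/",
                                "http://example.com/style.css"]))
```

A URL reported by `missing_urls` was expected on the page but never recorded, which is what one would see for a resource served from a cache without a network request.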
Thanks for taking the time to clarify :)
Since I already have Splash in production, I might tackle the issue by modifying the Cache-Control headers in HTTP responses. This way, WebKit would assume that no resources should be cached.
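A minimal sketch of that header-rewriting idea, assuming the responses pass through something that can edit headers (for example a proxy in front of Splash). The `disable_caching` helper is hypothetical, and whether WebKit's caches honor these headers in every case has not been verified here.

```python
# Headers that allow caching or revalidation; compared case-insensitively.
CACHE_HEADERS = {"cache-control", "pragma", "expires", "etag", "last-modified"}


def disable_caching(headers):
    """Return a new list of (name, value) response headers that forbids caching.

    Existing caching headers and validators are dropped, then explicit
    no-cache directives are appended. The input list is not modified.
    """
    kept = [(name, value) for name, value in headers
            if name.lower() not in CACHE_HEADERS]
    kept.extend([
        ("Cache-Control", "no-store, no-cache, must-revalidate, max-age=0"),
        ("Pragma", "no-cache"),
        ("Expires", "0"),
    ])
    return kept


# Hypothetical response headers, made up for illustration only.
original = [
    ("Content-Type", "text/css"),
    ("Cache-Control", "max-age=3600"),
    ("ETag", '"abc123"'),
]
for name, value in disable_caching(original):
    print(f"{name}: {value}")
```

Dropping `ETag` and `Last-Modified` removes the validators, so the client has nothing to send in a conditional request and has to fetch the full resource again.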
Let me know if I can be of any help, and thanks again.
Hi @kmike
There is a way to do it in QWebKit (see http://doc.qt.io/qt-4.8/qnetworkrequest.html#CacheLoadControl-enum), but currently this option is not exposed by Splash.
I used to believe that, and I even tried to make a PR that way. However, I later realized that this is not the case (confirmed by local testing).
The QNetworkRequest::CacheLoadControl attribute is set on individual request instances, and it is Qt's network manager that decides whether to use a disk cache. However, in the current implementation of Splash, caching in the network managers is not enabled at all (please check https://github.com/scrapinghub/splash/blob/master/splash/network_manager.py#L42).
WebKit also has its own in-memory cache (for scripts, stylesheets, images, etc.), and I believe that is the real cause. Some scenarios require strictly disabling every kind of caching, so I made PR #339 for this.
Hi, thanks for the wonderful work on Splash.
I just wanted to know if there is any way to disable browser caching of files? Or maybe return all HTTP requests made in har/log/entries, not just the ones with 200 http status ?
thanks in advance