palewire / savepagenow

A simple Python wrapper and command-line interface for archive.org’s "Save Page Now" capturing service
https://palewi.re/docs/savepagenow/
MIT License

Indicate if a cached version of the page is returned #3

Closed palewire closed 7 years ago

palewire commented 7 years ago

Here are the archive.org headers from what I believe to be a cached response. I suspect the clue is somewhere in here.

{'X-Archive-Orig-Connection': 'close', 'X-Archive-Orig-Content-Type': 'text/html', 'Transfer-Encoding': 'chunked', 'X-Archive-Orig-Accept-Ranges': 'bytes', 'X-Archive-Orig-Content-Length': '1270', 'X-Archive-Orig-Date': 'Tue, 18 Oct 2016 20:48:54 GMT', 'X-Archive-Orig-Etag': '"359670651+gzip"', 'X-Archive-Orig-x-ec-custom-error': '1', 'X-Archive-Orig-X-Cache': 'HIT', 'X-Archive-Guessed-Charset': 'utf-8', 'Date': 'Tue, 18 Oct 2016 20:49:51 GMT', 'X-Archive-Orig-Cache-Control': 'max-age=604800', 'X-Archive-Orig-Vary': 'Accept-Encoding', 'X-Archive-Playback': '1', 'X-Archive-Orig-Expires': 'Tue, 25 Oct 2016 20:48:54 GMT', 'set-cookie': 'wayback_server=24; Domain=archive.org; Path=/; Expires=Thu, 17-Nov-16 20:49:51 GMT;, JSESSIONID=6AE3A28003F983D640B0B393FB27315B; Path=/; HttpOnly', 'X-Archive-Orig-Server': 'ECS (rhv/818F)', 'Server': 'Tengine/2.1.0', 'Connection': 'keep-alive', 'X-Archive-Orig-Last-Modified': 'Fri, 09 Aug 2013 23:54:35 GMT', 'Content-Location': '/web/20161018204854/http://www.example.com/', 'Content-Encoding': 'gzip', 'X-Page-Cache': 'HIT', 'Content-Type': 'text/html;charset=utf-8'}

palewire commented 7 years ago

Here's one where I believe the page was newly archived. Notice that the X-Page-Cache header is MISS, whereas in the response above it is HIT.

{'X-Archive-Orig-Connection': 'close', 'X-Archive-Orig-Content-Type': 'text/html; charset=utf-8', 'Transfer-Encoding': 'chunked', 'X-Archive-Orig-Accept-Ranges': 'bytes', 'X-Archive-Orig-X-Varnish': '5124236', 'X-Archive-Orig-Date': 'Tue, 18 Oct 2016 20:46:40 GMT', 'X-Archive-Orig-Transfer-Encoding': 'chunked', 'X-Archive-Orig-grace': 'none', 'X-Archive-Orig-Via': '1.1 varnish-v4', 'Date': 'Tue, 18 Oct 2016 20:52:18 GMT', 'X-Archive-Orig-X-Varnish-TTL': '180s', 'X-Archive-Playback': '1', 'X-Archive-Guessed-Charset': 'utf-8', 'Set-Cookie': 'JSESSIONID=C4B83FACB234F75164F9BFC8D7DAFCC8; Path=/; HttpOnly', 'X-Archive-Orig-Server': 'Apache/2.4.10 (Ubuntu) mod_wsgi/3.5 Python/2.7.8', 'Server': 'Tengine/2.1.0', 'Connection': 'keep-alive', 'X-Archive-Orig-X-Varnish-Cache': 'HIT', 'Content-Location': '/web/20161018205218/http://palewi.re/posts/2016/04/18/pulitzer-pride/', 'Content-Encoding': 'gzip', 'X-Archive-Orig-Age': '0', 'X-Page-Cache': 'MISS', 'Content-Type': 'text/html;charset=utf-8'}
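
If the X-Page-Cache header is indeed the signal, the check could be sketched roughly as below. This is only an illustration based on the two responses in this thread, not necessarily how savepagenow implements it. The function name `was_served_from_cache` is hypothetical, and treating a missing header as "not cached" is an assumption.

```python
def was_served_from_cache(headers):
    """Guess whether archive.org returned a cached capture.

    Assumption drawn from the two responses in this thread:
    'X-Page-Cache: HIT' means a cached capture was returned and
    'MISS' means a new capture was made. A missing header is
    treated as "not cached", which is also an assumption.
    """
    # HTTP header names are case-insensitive, so normalize the keys
    # in case a plain dict is passed in.
    normalized = {key.lower(): value for key, value in headers.items()}
    return normalized.get("x-page-cache", "").strip().upper() == "HIT"


# Trimmed-down versions of the two header sets posted in this thread.
cached_headers = {
    "Content-Location": "/web/20161018204854/http://www.example.com/",
    "X-Page-Cache": "HIT",
}
fresh_headers = {
    "Content-Location": "/web/20161018205218/http://palewi.re/posts/2016/04/18/pulitzer-pride/",
    "X-Page-Cache": "MISS",
}

print(was_served_from_cache(cached_headers))  # True
print(was_served_from_cache(fresh_headers))   # False
```

The function takes any mapping of header names to values, so the headers of an HTTP response object could be passed in directly.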

palewire commented 7 years ago

I believe this is done.