propublica / politwoops_sunlight

Politwoops web front end

Recent deletions missing screenshots... #35

Closed nickom closed 7 years ago

nickom commented 9 years ago

The status page shows that the screenshot worker is running, but I can't figure out why we missed images for these links. Maybe there's something hidden in the logs, @handlers?

http://politwoops.sunlightfoundation.com/tweet/523932179626467328

http://politwoops.sunlightfoundation.com/tweet/523621873184022528

http://politwoops.sunlightfoundation.com/tweet/523499451366010881

http://politwoops.sunlightfoundation.com/tweet/523261427671629824

timball commented 9 years ago

Huh. Okay, so of the ones you cited, here's a sample HTTP request chain:

http://politwoops.sunlightfoundation.com/tweet/523499451366010881 cites http://t.co/sX8GXz2Hye

timball@thompson {144}$ curl -I http://t.co/sX8GXz2Hye
HTTP/1.1 301 Moved Permanently
cache-control: private,max-age=300
content-length: 0
date: Mon, 20 Oct 2014 21:33:26 UTC
expires: Mon, 20 Oct 2014 21:38:26 GMT
location: http://twitter.com/JoyceBeatty/status/523499451366010881/photo/1
server: tsa_b
set-cookie: muc=5f7365e8-3efd-4fac-8e3b-a934022e0459; Expires=Sat, 01 Oct 2016 21:33:26 GMT; Domain=t.co
x-connection-hash: 5f7a268e01787f2fa320d59335314d62

Then follow the 301 to http://twitter.com/JoyceBeatty/status/523499451366010881/photo/1

timball@thompson {145}$ curl -I http://twitter.com/JoyceBeatty/status/523499451366010881/photo/1
HTTP/1.1 301 Moved Permanently
content-length: 0
date: Mon, 20 Oct 2014 21:34:01 UTC
location: https://twitter.com/JoyceBeatty/status/523499451366010881/photo/1
server: tsa_a
set-cookie: guest_id=v1%3A141384084136511954; Domain=.twitter.com; Path=/; Expires=Wed, 19-Oct-2016 21:34:01 UTC
x-connection-hash: 9ad3bfcf4a3c24b59a727a3effc96fc8

And the final request:

timball@thompson {146}$ curl -I https://twitter.com/JoyceBeatty/status/523499451366010881/photo/1
HTTP/1.1 404 Not Found
cache-control: no-cache, no-store, must-revalidate, pre-check=0, post-check=0
content-length: 4311
content-security-policy-report-only: default-src 'none'; img-src https://abs.twimg.com https://ssl.google-analytics.com; script-src https://abs.twimg.com https://ssl.google-analytics.com about:; style-src https://abs.twimg.com 'unsafe-inline'; font-src https://abs.twimg.com https://twitter.com;connect-src 'none'; object-src 'none'; media-src 'none'; frame-src 'none'; report-uri https://twitter.com/i/csp_report?a=ORTGK%3D%3D%3D&ro=false
content-type: text/html;charset=utf-8
date: Tue, 21 Oct 2014 00:31:47 UTC
expires: Tue, 31 Mar 1981 05:00:00 GMT
last-modified: Tue, 21 Oct 2014 00:31:47 GMT
ms: A
pragma: no-cache
server: tsa_b
set-cookie: _twitter_sess=BAh7CSIKZmxhc2hJQzonQWN0aW9uQ29udHJvbGxlcjo6Rmxhc2g6OkZsYXNo%250ASGFzaHsABjoKQHVzZWR7ADoPY3JlYXRlZF9hdGwrCArtHTBJAToMY3NyZl9p%250AZCIlZTJiZTZmNWU3MmViMTE4ODVjMDgzMzY3Nzk5NmQ5NDY6B2lkIiU3ZGY3%250AMDBhMzUwZGUxMmIwNzA2NjI1NDljOTM3MWI4MA%253D%253D--beb4dd312eba550ed10931e340ed2dcae3b0856a; Path=/; Domain=.twitter.com; Secure; HTTPOnly
set-cookie: guest_id=v1%3A141385150797735042; Domain=.twitter.com; Path=/; Expires=Thu, 20-Oct-2016 00:31:47 UTC
status: 404 Not Found
strict-transport-security: max-age=631138519
x-connection-hash: de62c37d6d87f1910ccf558e49dc8b27
x-content-type-options: nosniff
x-frame-options: SAMEORIGIN
x-transaction: 2512ba6424313489
x-ua-compatible: IE=edge,chrome=1
x-xss-protection: 1; mode=block
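
The hop-by-hop `curl -I` check above can also be scripted. Here is a minimal Python sketch that uses only the standard library. The helper names `http_head` and `trace_redirects` are made up for illustration and are not part of Politwoops or the tweet collector. It sends a HEAD request per hop, records the status, and stops at the first non-redirect response.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so each hop can be inspected."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def http_head(url, timeout=10):
    """Send one HEAD request; return (status, Location header or None)."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status, response.headers.get("Location")
    except urllib.error.HTTPError as err:
        # Unfollowed 3xx responses and 4xx/5xx responses are raised as HTTPError.
        return err.code, err.headers.get("Location")


def trace_redirects(url, head=http_head, max_hops=10):
    """Return [(url, status), ...] for each hop up to a non-redirect response.

    `head` is injectable (an assumption made for this sketch) so the chain
    logic can be exercised without touching the network.
    """
    chain = []
    for _ in range(max_hops):
        status, location = head(url)
        chain.append((url, status))
        if status not in (301, 302, 303, 307, 308) or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    return chain
```

Calling `trace_redirects("http://t.co/sX8GXz2Hye")` with the default `http_head` hits the network, and the responses today may well differ from the 2014 output above. Passing a stub `head` function that returns the recorded statuses reproduces the 301, 301, 404 chain offline.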

plantfansam commented 9 years ago

This was likely an issue with the worker, which has been (hopefully) fixed here: https://github.com/sunlightlabs/politwoops-tweet-collector/commit/f369e55157f8ef27e7782146ac77c4b8f1ccbb5c