webrecorder / pywb

Core Python Web Archiving Toolkit for replay and recording of web archives
https://pypi.python.org/pypi/pywb
GNU General Public License v3.0

Not replaying XHR POST request from legacy collection #861

Closed despens closed 1 month ago

despens commented 10 months ago

At some point before pywb version 2.3.1, a particular POST request stopped replaying. It affects old Instagram captures.

At the moment, the web archive cannot be replayed in pywb 2.3.1 or 2.7.4, but it works in Conifer.

Steps to reproduce the bug

The same web archive is available at three different deployments:

| Status | Deployment | pywb | URL |
|---|---|---|---|
| works | Conifer | 2.5.0 | https://conifer.rhizome.org/despens/amalia-ulman-excellences--perfections/list/four-personas/b1/20141014150552/http://instagram.com/amaliaulman |
| fails | Webenact | 2.3.1 | https://webenact.rhizome.org/excellences-and-perfections/20141014150552/http://instagram.com/amaliaulman |
| fails | Rhizome Webarchives | 2.7.4 | https://webarchives.rhizome.org/excellences-and-perfections/20141014150552/http://instagram.com/amaliaulman |

Click any of the Instagram photos in the grid to trigger the POST request, for instance the one in this screenshot: [Screenshot from 2023-08-25 12-53-17]

HAR logs

Conifer ```json { "log": { "version": "1.2", "creator": { "name": "Firefox", "version": "117.0" }, "browser": { "name": "Firefox", "version": "117.0" }, "pages": [ { "id": "page_1", "pageTimings": { "onContentLoad": -533266, "onLoad": -533261 }, "startedDateTime": "2023-08-25T12:54:24.168+02:00", "title": "https://conifer.rhizome.org/despens/amalia-ulman-excellences--perfections/list/four-personas/b1/20141014150552/http://instagram.com/amaliaulman" } ], "entries": [ { "startedDateTime": "2023-08-25T12:54:24.168+02:00", "request": { "bodySize": 596, "method": "POST", "url": "https://cones.conifer.rhizome.org/despens/amalia-ulman-excellences--perfections/list/four-personas/b1/20141014150552mp_/http://instagram.com/query/", "httpVersion": "HTTP/2", "headers": [ { "name": "Host", "value": "cones.conifer.rhizome.org" }, { "name": "User-Agent", "value": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/117.0" }, { "name": "Accept", "value": "application/json, text/javascript, */*; q=0.01" }, { "name": "Accept-Language", "value": "en-US,en;q=0.7,de-DE;q=0.3" }, { "name": "Accept-Encoding", "value": "gzip, deflate, br" }, { "name": "X-Pywb-Requested-With", "value": "XMLHttpRequest" }, { "name": "Content-Type", "value": "application/x-www-form-urlencoded; charset=UTF-8" }, { "name": "X-Instagram-AJAX", "value": "1" }, { "name": "X-CSRFToken", "value": "deleted" }, { "name": "X-Requested-With", "value": "XMLHttpRequest" }, { "name": "Content-Length", "value": "596" }, { "name": "Origin", "value": "https://cones.conifer.rhizome.org" }, { "name": "Connection", "value": "keep-alive" }, { "name": "Referer", "value": "https://cones.conifer.rhizome.org/despens/amalia-ulman-excellences--perfections/list/four-personas/b1/20141014150552mp_/http://instagram.com/p/s67XD2FV5l/?modal=true" }, { "name": "Cookie", "value": "__wr_sesh=WyJhVlNoekY2dWRSK1UxZUxnZ0xTcURyNHlCaDA9IixmYWxzZV0.ZOiGUw.C9L1H2HfgzeqRKN95kVzpLrgCW0" }, { "name": "Sec-Fetch-Dest", "value": 
"empty" }, { "name": "Sec-Fetch-Mode", "value": "cors" }, { "name": "Sec-Fetch-Site", "value": "same-origin" }, { "name": "Pragma", "value": "no-cache" }, { "name": "Cache-Control", "value": "no-cache" } ], "cookies": [ { "name": "__wr_sesh", "value": "WyJhVlNoekY2dWRSK1UxZUxnZ0xTcURyNHlCaDA9IixmYWxzZV0.ZOiGUw.C9L1H2HfgzeqRKN95kVzpLrgCW0" } ], "queryString": [], "headersSize": 1061, "postData": { "mimeType": "application/x-www-form-urlencoded", "params": [ { "name": "q", "value": "ig_shortcode(s67XD2FV5l) { id, code, owner { id, username, is_private, profile_pic_url, followed_by_viewer, requested_by_viewer },is_video, video_url, shared_by_author, date, display_src, WB_wombat_location { name }, caption, usertags { nodes { user { username }, position }}, likes { count, viewer_has_liked, nodes { user { username, profile_pic_url, followed_by_viewer, requested_by_viewer }}}, comments.last(20) { nodes { id, user { username, profile_pic_url }, text, viewer_can_delete }}}" } ], "text": "q=ig_shortcode(s67XD2FV5l)+%7B+id%2C+code%2C+owner+%7B+id%2C+username%2C+is_private%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+%7D%2Cis_video%2C+video_url%2C+shared_by_author%2C+date%2C+display_src%2C+WB_wombat_location+%7B+name+%7D%2C+caption%2C+usertags+%7B+nodes+%7B+user+%7B+username+%7D%2C+position+%7D%7D%2C+likes+%7B+count%2C+viewer_has_liked%2C+nodes+%7B+user+%7B+username%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+%7D%7D%7D%2C+comments.last(20)+%7B+nodes+%7B+id%2C+user+%7B+username%2C+profile_pic_url+%7D%2C+text%2C+viewer_can_delete+%7D%7D%7D" } }, "response": { "status": 200, "statusText": "OK", "httpVersion": "HTTP/2", "headers": [ { "name": "server", "value": "nginx/1.17.9" }, { "name": "date", "value": "Fri, 25 Aug 2023 10:54:24 GMT" }, { "name": "content-type", "value": "application/json" }, { "name": "x-archive-orig-cache-control", "value": "private, no-cache, no-store, must-revalidate" }, { "name": "x-archive-orig-content-encoding", 
"value": "gzip" }, { "name": "content-language", "value": "en" }, { "name": "x-archive-orig-date", "value": "Tue, 14 Oct 2014 16:29:19 GMT" }, { "name": "x-archive-orig-expires", "value": "Sat, 01 Jan 2000 00:00:00 GMT" }, { "name": "x-archive-orig-pragma", "value": "no-cache" }, { "name": "x-archive-orig-server", "value": "nginx" }, { "name": "set-cookie", "value": "csrftoken=deleted; Path=/despens/amalia-ulman-excellences--perfections/list/four-personas/b1/20141014150552mp_/http://instagram.com/" }, { "name": "x-archive-orig-vary", "value": "Cookie, Accept-Language, Accept-Encoding" }, { "name": "x-archive-orig-content-length", "value": "909" }, { "name": "x-archive-orig-connection", "value": "keep-alive" }, { "name": "content-security-policy", "value": "default-src 'unsafe-eval' 'unsafe-inline' 'self' data: blob: mediastream: ws: wss: conifer.rhizome.org/_set_session; form-action 'self'" }, { "name": "strict-transport-security", "value": "max-age=31536000" }, { "name": "X-Firefox-Spdy", "value": "h2" } ], "cookies": [ { "name": "csrftoken", "value": "deleted" } ], "content": { "mimeType": "application/json", "size": 3048, "text": "{\"status\":\"ok\",\"code\":\"s67XD2FV5l\",\"shared_by_author\":true,\"usertags\":{\"nodes\":[]},\"owner\":{\"username\":\"amaliaulman\",\"requested_by_viewer\":false,\"followed_by_viewer\":false,\"profile_pic_url\":\"http:\\/\\/photos-a.ak.instagram.com\\/hphotos-ak-xaf1\\/10724811_680031252095464_259338633_a.jpg\",\"id\":\"202871366\",\"is_private\":false},\"comments\":{\"nodes\":[{\"text\":\"black and white 
roses\",\"viewer_can_delete\":false,\"id\":\"809233848265432742\",\"user\":{\"username\":\"yiming2014\",\"profile_pic_url\":\"http:\\/\\/images.ak.instagram.com\\/profiles\\/profile_1027541052_75sq_1393581182.jpg\"}},{\"text\":\"\\ud83d\\udc4f\\ud83d\\udc4f\\ud83d\\udc9f\",\"viewer_can_delete\":false,\"id\":\"809380233082134177\",\"user\":{\"username\":\"annasoldner\",\"profile_pic_url\":\"http:\\/\\/photos-f.ak.instagram.com\\/hphotos-ak-xfa1\\/10643885_239528626220653_744988609_a.jpg\"}}]},\"caption\":\"\\ud83d\\udc99\",\"likes\":{\"count\":130,\"viewer_has_liked\":false,\"nodes\":[{\"user\":{\"username\":\"michellerawlings\",\"profile_pic_url\":\"http:\\/\\/photos-h.ak.instagram.com\\/hphotos-ak-xap1\\/10362315_509364185856223_1822410738_a.jpg\"}},{\"user\":{\"username\":\"oscarsaurus_rex\",\"requested_by_viewer\":false,\"profile_pic_url\":\"http:\\/\\/photos-d.ak.instagram.com\\/hphotos-ak-xaf1\\/10610974_756432981085155_2039221624_a.jpg\",\"followed_by_viewer\":false}},{\"user\":{\"username\":\"leahschrager\",\"requested_by_viewer\":false,\"profile_pic_url\":\"http:\\/\\/images.ak.instagram.com\\/profiles\\/profile_1280121550_75sq_1399326744.jpg\",\"followed_by_viewer\":false}},{\"user\":{\"username\":\"hannahthoughts\",\"requested_by_viewer\":false,\"profile_pic_url\":\"http:\\/\\/images.ak.instagram.com\\/profiles\\/profile_423257681_75sq_1371768205.jpg\",\"followed_by_viewer\":false}},{\"user\":{\"username\":\"marilynschneider\",\"requested_by_viewer\":false,\"profile_pic_url\":\"http:\\/\\/photos-c.ak.instagram.com\\/hphotos-ak-xpf1\\/10349626_239240692932746_1425483956_a.jpg\",\"followed_by_viewer\":false}},{\"user\":{\"username\":\"tictactoy\",\"requested_by_viewer\":false,\"profile_pic_url\":\"http:\\/\\/images.ak.instagram.com\\/profiles\\/profile_37452681_75sq_1372076997.jpg\",\"followed_by_viewer\":false}},{\"user\":{\"username\":\"chisenhalegallery\",\"requested_by_viewer\":false,\"profile_pic_url\":\"http:\\/\\/photos-d.ak.instagram.com\\/hphotos-ak
-xpa1\\/10467860_310275525815555_931978272_a.jpg\",\"followed_by_viewer\":false}},{\"user\":{\"username\":\"ninandkris\",\"requested_by_viewer\":false,\"profile_pic_url\":\"http:\\/\\/photos-a.ak.instagram.com\\/hphotos-ak-xaf1\\/10617006_1475234602725696_1208035371_a.jpg\",\"followed_by_viewer\":false}},{\"user\":{\"username\":\"booksandwine_\",\"requested_by_viewer\":false,\"profile_pic_url\":\"http:\\/\\/photos-h.ak.instagram.com\\/hphotos-ak-xaf1\\/10661281_360850720743983_1963563015_a.jpg\",\"followed_by_viewer\":false}},{\"user\":{\"username\":\"simsimsakhai\",\"requested_by_viewer\":false,\"profile_pic_url\":\"http:\\/\\/images.ak.instagram.com\\/profiles\\/profile_1154100479_75sq_1394153158.jpg\",\"followed_by_viewer\":false}}]},\"date\":1410686581.0,\"is_video\":false,\"id\":\"809220152487140965\",\"display_src\":\"http:\\/\\/photos-g.ak.instagram.com\\/hphotos-ak-xaf1\\/10608019_719252001482622_659224595_n.jpg\"}" }, "redirectURL": "", "headersSize": 926, "bodySize": 3974 }, "cache": {}, "timings": { "blocked": 0, "dns": 1, "connect": 180, "ssl": 200, "send": 0, "wait": 286, "receive": 0 }, "time": 667, "_securityState": "secure", "serverIPAddress": "54.164.112.170", "connection": "443", "pageref": "page_1" } ] } } ```
Webenact ```json { "log": { "version": "1.2", "creator": { "name": "Firefox", "version": "117.0" }, "browser": { "name": "Firefox", "version": "117.0" }, "pages": [ { "id": "page_1", "pageTimings": { "onContentLoad": -146122, "onLoad": -143630 }, "startedDateTime": "2023-08-25T13:03:01.894+02:00", "title": "https://webenact.rhizome.org/excellences-and-perfections/20141014150552/http://instagram.com/amaliaulman" } ], "entries": [ { "startedDateTime": "2023-08-25T13:03:01.894+02:00", "request": { "bodySize": 596, "method": "POST", "url": "https://webenact.rhizome.org/excellences-and-perfections/20141014150552/http://instagram.com/query/", "httpVersion": "HTTP/1.1", "headers": [ { "name": "Host", "value": "webenact.rhizome.org" }, { "name": "User-Agent", "value": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/117.0" }, { "name": "Accept", "value": "application/json, text/javascript, */*; q=0.01" }, { "name": "Accept-Language", "value": "en-US,en;q=0.7,de-DE;q=0.3" }, { "name": "Accept-Encoding", "value": "gzip, deflate, br" }, { "name": "X-Pywb-Requested-With", "value": "XMLHttpRequest" }, { "name": "Content-Type", "value": "application/x-www-form-urlencoded; charset=UTF-8" }, { "name": "X-Instagram-AJAX", "value": "1" }, { "name": "X-CSRFToken", "value": "deleted" }, { "name": "X-Requested-With", "value": "XMLHttpRequest" }, { "name": "Content-Length", "value": "596" }, { "name": "Origin", "value": "https://webenact.rhizome.org" }, { "name": "Connection", "value": "keep-alive" }, { "name": "Referer", "value": "https://webenact.rhizome.org/excellences-and-perfections/20141014150552/http://instagram.com/p/s67XD2FV5l/?modal=true" }, { "name": "Sec-Fetch-Dest", "value": "empty" }, { "name": "Sec-Fetch-Mode", "value": "cors" }, { "name": "Sec-Fetch-Site", "value": "same-origin" }, { "name": "Pragma", "value": "no-cache" }, { "name": "Cache-Control", "value": "no-cache" } ], "cookies": [], "queryString": [], "headersSize": 854, "postData": { 
"mimeType": "application/x-www-form-urlencoded", "params": [ { "name": "q", "value": "ig_shortcode(s67XD2FV5l) { id, code, owner { id, username, is_private, profile_pic_url, followed_by_viewer, requested_by_viewer },is_video, video_url, shared_by_author, date, display_src, WB_wombat_location { name }, caption, usertags { nodes { user { username }, position }}, likes { count, viewer_has_liked, nodes { user { username, profile_pic_url, followed_by_viewer, requested_by_viewer }}}, comments.last(20) { nodes { id, user { username, profile_pic_url }, text, viewer_can_delete }}}" } ], "text": "q=ig_shortcode(s67XD2FV5l)+%7B+id%2C+code%2C+owner+%7B+id%2C+username%2C+is_private%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+%7D%2Cis_video%2C+video_url%2C+shared_by_author%2C+date%2C+display_src%2C+WB_wombat_location+%7B+name+%7D%2C+caption%2C+usertags+%7B+nodes+%7B+user+%7B+username+%7D%2C+position+%7D%7D%2C+likes+%7B+count%2C+viewer_has_liked%2C+nodes+%7B+user+%7B+username%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+%7D%7D%7D%2C+comments.last(20)+%7B+nodes+%7B+id%2C+user+%7B+username%2C+profile_pic_url+%7D%2C+text%2C+viewer_can_delete+%7D%7D%7D" } }, "response": { "status": 404, "statusText": "Not Found", "httpVersion": "HTTP/1.1", "headers": [ { "name": "Server", "value": "nginx/1.14.0 (Ubuntu)" }, { "name": "Date", "value": "Fri, 25 Aug 2023 11:03:02 GMT" }, { "name": "Content-Type", "value": "text/html" }, { "name": "Transfer-Encoding", "value": "chunked" }, { "name": "Connection", "value": "keep-alive" }, { "name": "Content-Encoding", "value": "gzip" } ], "cookies": [], "content": { "mimeType": "text/html", "size": 720, "text": "\n\n\n \n URL Not Found\n \n \n \n\n\n
URL Not Found\n The url http://instagram.com/query/ could not be found in this collection.\n
\n\n" }, "redirectURL": "", "headersSize": 195, "bodySize": 583 }, "cache": {}, "timings": { "blocked": 0, "dns": 2, "connect": 111, "ssl": 117, "send": 0, "wait": 124, "receive": 0 }, "time": 354, "_securityState": "secure", "serverIPAddress": "35.245.250.198", "connection": "443", "pageref": "page_1" } ] } } ```
Rhizome Webarchives ```json { "log": { "version": "1.2", "creator": { "name": "Firefox", "version": "117.0" }, "browser": { "name": "Firefox", "version": "117.0" }, "pages": [ { "id": "page_1", "pageTimings": { "onContentLoad": -10519, "onLoad": -6630 }, "startedDateTime": "2023-08-25T13:05:11.629+02:00", "title": "https://webarchives.rhizome.org/excellences-and-perfections/20141014150552/http://instagram.com/amaliaulman" } ], "entries": [ { "startedDateTime": "2023-08-25T13:05:11.629+02:00", "request": { "bodySize": 586, "method": "POST", "url": "https://webarchives.rhizome.org/excellences-and-perfections/20141014150552mp_/http://instagram.com/query/", "httpVersion": "HTTP/1.1", "headers": [ { "name": "Host", "value": "webarchives.rhizome.org" }, { "name": "User-Agent", "value": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/117.0" }, { "name": "Accept", "value": "application/json, text/javascript, */*; q=0.01" }, { "name": "Accept-Language", "value": "en-US,en;q=0.7,de-DE;q=0.3" }, { "name": "Accept-Encoding", "value": "gzip, deflate, br" }, { "name": "X-Pywb-Requested-With", "value": "XMLHttpRequest" }, { "name": "Content-Type", "value": "application/x-www-form-urlencoded; charset=UTF-8" }, { "name": "X-Instagram-AJAX", "value": "1" }, { "name": "X-CSRFToken", "value": "deleted" }, { "name": "X-Requested-With", "value": "XMLHttpRequest" }, { "name": "Content-Length", "value": "586" }, { "name": "Origin", "value": "https://webarchives.rhizome.org" }, { "name": "Connection", "value": "keep-alive" }, { "name": "Referer", "value": "https://webarchives.rhizome.org/excellences-and-perfections/20141014150552mp_/http://instagram.com/p/s67XD2FV5l/?modal=true" }, { "name": "Cookie", "value": "csrftoken=deleted; __utma=1.164060185.1413299153.1413299153.1413299153.1; __utmb=1.1.10.1413299153; __utmc=1; __utmz=1.1413299153.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utmt=1" }, { "name": "Sec-Fetch-Dest", "value": "empty" }, { "name": 
"Sec-Fetch-Mode", "value": "cors" }, { "name": "Sec-Fetch-Site", "value": "same-origin" }, { "name": "Pragma", "value": "no-cache" }, { "name": "Cache-Control", "value": "no-cache" } ], "cookies": [ { "name": "csrftoken", "value": "deleted" }, { "name": "__utma", "value": "1.164060185.1413299153.1413299153.1413299153.1" }, { "name": "__utmb", "value": "1.1.10.1413299153" }, { "name": "__utmc", "value": "1" }, { "name": "__utmz", "value": "1.1413299153.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)" }, { "name": "__utmt", "value": "1" } ], "queryString": [], "headersSize": 1068, "postData": { "mimeType": "application/x-www-form-urlencoded", "params": [ { "name": "q", "value": "ig_shortcode(s67XD2FV5l) { id, code, owner { id, username, is_private, profile_pic_url, followed_by_viewer, requested_by_viewer },is_video, video_url, shared_by_author, date, display_src, location { name }, caption, usertags { nodes { user { username }, position }}, likes { count, viewer_has_liked, nodes { user { username, profile_pic_url, followed_by_viewer, requested_by_viewer }}}, comments.last(20) { nodes { id, user { username, profile_pic_url }, text, viewer_can_delete }}}" } ], "text": "q=ig_shortcode(s67XD2FV5l)+%7B+id%2C+code%2C+owner+%7B+id%2C+username%2C+is_private%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+%7D%2Cis_video%2C+video_url%2C+shared_by_author%2C+date%2C+display_src%2C+location+%7B+name+%7D%2C+caption%2C+usertags+%7B+nodes+%7B+user+%7B+username+%7D%2C+position+%7D%7D%2C+likes+%7B+count%2C+viewer_has_liked%2C+nodes+%7B+user+%7B+username%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+%7D%7D%7D%2C+comments.last(20)+%7B+nodes+%7B+id%2C+user+%7B+username%2C+profile_pic_url+%7D%2C+text%2C+viewer_can_delete+%7D%7D%7D" } }, "response": { "status": 404, "statusText": "Not Found", "httpVersion": "HTTP/1.1", "headers": [ { "name": "Server", "value": "nginx/1.18.0 (Ubuntu)" }, { "name": "Date", "value": "Fri, 25 Aug 2023 11:05:11 GMT" }, { 
"name": "Content-Type", "value": "text/html" }, { "name": "Transfer-Encoding", "value": "chunked" }, { "name": "Connection", "value": "keep-alive" }, { "name": "Content-Encoding", "value": "gzip" } ], "cookies": [], "content": { "mimeType": "text/html", "size": 1093, "text": "\n\n \n \n \n\n URL Not Found\n\n \n\n\n\n\n\n \n\n \n
URL Not Found\n The url http://instagram.com/query/ could not be found in this collection.\n
\n\n \n" }, "redirectURL": "", "headersSize": 195, "bodySize": 668 }, "cache": {}, "timings": { "blocked": 0, "dns": 0, "connect": 0, "ssl": 0, "send": 0, "wait": 124, "receive": 0 }, "time": 124, "_securityState": "secure", "serverIPAddress": "35.236.219.133", "connection": "443", "pageref": "page_1" } ] } } ```
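To compare the three logs without reading them in full, a short script can pull out the fields that matter here. This is a minimal sketch assuming each log has been saved as valid HAR JSON; the function name `summarize_har` and the trimmed `SAMPLE_HAR` are illustrative, not part of pywb or the browser tooling.

```python
import json


def summarize_har(har_text):
    """Return (method, url, status, request body size) for every entry in a HAR log."""
    har = json.loads(har_text)
    rows = []
    for entry in har["log"]["entries"]:
        request, response = entry["request"], entry["response"]
        rows.append((request["method"], request["url"], response["status"], request["bodySize"]))
    return rows


# Trimmed-down stand-in for a saved HAR file; the values are copied from the Webenact log.
SAMPLE_HAR = json.dumps({
    "log": {
        "entries": [
            {
                "request": {
                    "method": "POST",
                    "url": "https://webenact.rhizome.org/excellences-and-perfections/20141014150552/http://instagram.com/query/",
                    "bodySize": 596,
                },
                "response": {"status": 404},
            }
        ]
    }
})

for row in summarize_har(SAMPLE_HAR):
    print(row)
```

Run against the three logs, this would make the differences easy to see: status 200 on Conifer versus 404 on the other two, and a request body size of 586 on Rhizome Webarchives versus 596 elsewhere.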

Archive data

WARC

```WARC
WARC/1.0
WARC-Type: response
WARC-Record-ID:
WARC-Date: 2014-10-14T16:29:23Z
WARC-Target-URI: http://instagram.com/query/
WARC-IP-Address: 54.236.170.22
Content-Type: application/http;msgtype=response
Content-Length: 1366
WARC-Block-Digest: sha1:MX6AUYZCF6D5ADTOBAXKWCIP7Q4UPWJT
WARC-Payload-Digest: sha1:EQP327EORTS2QTCXGOS54KCDO6X47LBB

HTTP/1.1 200 OK
Cache-Control: private, no-cache, no-store, must-revalidate
Content-Encoding: gzip
Content-Language: en
Content-Type: application/json
Date: Tue, 14 Oct 2014 16:29:19 GMT
Expires: Sat, 01 Jan 2000 00:00:00 GMT
Pragma: no-cache
Server: nginx
Set-Cookie: csrftoken=deleted; expires=Tue, 13-Oct-2015 16:29:19 GMT; Max-Age=31449600; Path=/
Vary: Cookie, Accept-Language, Accept-Encoding
Content-Length: 909
Connection: keep-alive

WARC/1.0
WARC-Type: request
WARC-Record-ID:
WARC-Date: 2014-10-14T16:29:23Z
WARC-Target-URI: http://instagram.com/query/
WARC-Concurrent-To:
WARC-Block-Digest: sha1:KKIVWVB3DRWYSW2WI2UROXAP7UP6BQ7W
Content-Type: application/http;msgtype=request
Content-Length: 1334

POST /query/ HTTP/1.1
x-csrftoken: deleted
content-length: 596
accept-language: en-US,en;q=0.5
accept-encoding: gzip, deflate
referer: http://instagram.com/p/ruEFWFlV10/?modal=true
x-instagram-ajax: 1
accept: application/json, text/javascript, */*; q=0.01
user-agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:32.0) Gecko/20100101 Firefox/32.0
connection: keep-alive
cookie: pywb.timestamp=20141014162919; csrftoken=deleted; mid=VD1PMgAEAAF9EQNZDMyG22AWoP1a; __utma=deleted; __utmb=deleted; __utmc=deleted; __utmz=deleted; __utmt=deleted; pywb_timestamp=deleted
pragma: no-cache
cache-control: no-cache
host: instagram.com
x-requested-with: XMLHttpRequest
content-type: application/x-www-form-urlencoded; charset=UTF-8

q=ig_shortcode(s67XD2FV5l)+%7B+id%2C+code%2C+owner+%7B+id%2C+username%2C+is_private%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+%7D%2Cis_video%2C+video_url%2C+shared_by_author%2C+date%2C+display_src%2C+WB_wombat_location+%7B+name+%7D%2C+caption%2C+usertags+%7B+nodes+%7B+user+%7B+username+%7D%2C+position+%7D%7D%2C+likes+%7B+count%2C+viewer_has_liked%2C+nodes+%7B+user+%7B+username%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+%7D%7D%7D%2C+comments.last(20)+%7B+nodes+%7B+id%2C+user+%7B+username%2C+profile_pic_url+%7D%2C+text%2C+viewer_can_delete+%7D%7D%7D
```
CDXj

```
com,instagram)/query?q=ig_shortcode(s67xd2fv5l)%20{%20id,%20code,%20owner%20{%20id,%20username,%20is_private,%20profile_pic_url,%20followed_by_viewer,%20requested_by_viewer%20},is_video,%20video_url,%20shared_by_author,%20date,%20display_src,%20location%20{%20name%20},%20caption,%20usertags%20{%20nodes%20{%20user%20{%20username%20},%20position%20}},%20likes%20{%20count,%20viewer_has_liked,%20nodes%20{%20user%20{%20username,%20profile_pic_url,%20followed_by_viewer,%20requested_by_viewer%20}}},%20comments.last(20)%20{%20nodes%20{%20id,%20user%20{%20username,%20profile_pic_url%20},%20text,%20viewer_can_delete%20}}} 20141014162923 {"url":"http://instagram.com/query/","mime":"application/json","status":"200","digest":"EQP327EORTS2QTCXGOS54KCDO6X47LBB","length":"1656","offset":"11394396","filename":"excellences-and-perfections_desktop-p3.warc.gz"}
```
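For anyone inspecting such an index by hand, a CDXJ line splits into three space-separated parts: the URL key, the timestamp, and a JSON object. The sketch below relies on the key containing no literal spaces, as in the entry above (spaces appear as `%20`). `parse_cdxj_line` is an illustrative helper rather than a pywb function, and `SAMPLE_LINE` is a shortened version of the real entry.

```python
import json


def parse_cdxj_line(line):
    """Split one CDXJ line into (urlkey, timestamp, fields dict)."""
    # Only the first two spaces separate fields; the JSON part may contain more.
    urlkey, timestamp, raw_json = line.strip().split(" ", 2)
    return urlkey, timestamp, json.loads(raw_json)


# Shortened for readability: the real key carries the full query and more JSON fields.
SAMPLE_LINE = (
    'com,instagram)/query?q=ig_shortcode(s67xd2fv5l) 20141014162923 '
    '{"url":"http://instagram.com/query/","mime":"application/json","status":"200",'
    '"filename":"excellences-and-perfections_desktop-p3.warc.gz"}'
)

urlkey, timestamp, fields = parse_cdxj_line(SAMPLE_LINE)
print(urlkey)
print(timestamp, fields["status"], fields["filename"])
```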

Testing with GET

The resource is always available when queried via GET:

Conifer: https://conifer.rhizome.org/despens/amalia-ulman-excellences--perfections/http://instagram.com/query/?q=ig_shortcode(s67XD2FV5l)+%7B+id%2C+code%2C+owner+%7B+id%2C+username%2C+is_private%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+%7D%2Cis_video%2C+video_url%2C+shared_by_author%2C+date%2C+display_src%2C+WB_wombat_location+%7B+name+%7D%2C+caption%2C+usertags+%7B+nodes+%7B+user+%7B+username+%7D%2C+position+%7D%7D%2C+likes+%7B+count%2C+viewer_has_liked%2C+nodes+%7B+user+%7B+username%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+%7D%7D%7D%2C+comments.last(20)+%7B+nodes+%7B+id%2C+user+%7B+username%2C+profile_pic_url+%7D%2C+text%2C+viewer_can_delete+%7D%7D%7D

Webenact: https://webenact.rhizome.org/excellences-and-perfections/http://instagram.com/query/?q=ig_shortcode(s67XD2FV5l)+%7B+id%2C+code%2C+owner+%7B+id%2C+username%2C+is_private%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+%7D%2Cis_video%2C+video_url%2C+shared_by_author%2C+date%2C+display_src%2C+WB_wombat_location+%7B+name+%7D%2C+caption%2C+usertags+%7B+nodes+%7B+user+%7B+username+%7D%2C+position+%7D%7D%2C+likes+%7B+count%2C+viewer_has_liked%2C+nodes+%7B+user+%7B+username%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+%7D%7D%7D%2C+comments.last(20)+%7B+nodes+%7B+id%2C+user+%7B+username%2C+profile_pic_url+%7D%2C+text%2C+viewer_can_delete+%7D%7D%7D

Rhizome Webarchives: https://webarchives.rhizome.org/excellences-and-perfections/http://instagram.com/query/?q=ig_shortcode(s67XD2FV5l)+{+id%2C+code%2C+owner+{+id%2C+username%2C+is_private%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+}%2Cis_video%2C+video_url%2C+shared_by_author%2C+date%2C+display_src%2C+WB_wombat_location+{+name+}%2C+caption%2C+usertags+{+nodes+{+user+{+username+}%2C+position+}}%2C+likes+{+count%2C+viewer_has_liked%2C+nodes+{+user+{+username%2C+profile_pic_url%2C+followed_by_viewer%2C+requested_by_viewer+}}}%2C+comments.last(20)+{+nodes+{+id%2C+user+{+username%2C+profile_pic_url+}%2C+text%2C+viewer_can_delete+}}}
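The GET test URLs above are the replay prefix, the target URL, and the form-encoded POST body appended as a query string. A minimal sketch of that construction, where `get_test_url` is a hypothetical helper and the body is shortened:

```python
def get_test_url(replay_prefix, target_url, form_body):
    """Append an already urlencoded POST body to the target URL as a query string."""
    separator = "&" if "?" in target_url else "?"
    return f"{replay_prefix}{target_url}{separator}{form_body}"


# The body here is a shortened stand-in for the full q=ig_shortcode(...) payload.
url = get_test_url(
    "https://webenact.rhizome.org/excellences-and-perfections/",
    "http://instagram.com/query/",
    "q=ig_shortcode(s67XD2FV5l)+%7B+id+%7D",
)
print(url)
```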

Summary

This seems to be a change in pywb behavior that breaks replay for some POST requests.

cc @m4rk3r @mona-ul

mona-ul commented 8 months ago

We found the solution: we simply had to reindex the collection (wb-manager reindex). The collection is now fully functional with pywb 2.7.4 on Rhizome Webarchives: https://webarchives.rhizome.org/excellences-and-perfections/20141014150552/http://instagram.com/amaliaulman

The previous index.cdxj for the collection was created in May 2019 and was compatible with the pywb version in use at that time. It seems that, somewhere along the way, newer pywb versions started handling POST requests differently, so the existing index no longer matched what pywb expected. This mismatch broke replay of the web archive.

Is this analysis correct?

Index entry of the POST request, May 2019, pywb version unknown:

com,instagram)/query?q=ig_shortcode(s67xd2fv5l)%20{%20id,%20code,%20owner%20{%20id,%20username,%20is_private,%20profile_pic_url,%20followed_by_viewer,%20requested_by_viewer%20},is_video,%20video_url,%20shared_by_author,%20date,%20display_src,%20location%20{%20name%20},%20caption,%20usertags%20{%20nodes%20{%20user%20{%20username%20},%20position%20}},%20likes%20{%20count,%20viewer_has_liked,%20nodes%20{%20user%20{%20username,%20profile_pic_url,%20followed_by_viewer,%20requested_by_viewer%20}}},%20comments.last(20)%20{%20nodes%20{%20id,%20user%20{%20username,%20profile_pic_url%20},%20text,%20viewer_can_delete%20}}} 20141014162923 {"url":"http://instagram.com/query/","mime":"application/json","status":"200","digest":"EQP327EORTS2QTCXGOS54KCDO6X47LBB","length":"1656","offset":"11394396","filename":"excellences-and-perfections_desktop-p3.warc.gz"}

Index entry of the POST request, Nov 2023, pywb version 2.7.4, with "requestBody":

com,instagram)/query?__wb_method=post&q=ig_shortcode(s67xd2fv5l)%20{%20id,%20code,%20owner%20{%20id,%20username,%20is_private,%20profile_pic_url,%20followed_by_viewer,%20requested_by_viewer%20},is_video,%20video_url,%20shared_by_author,%20date,%20display_src,%20location%20{%20name%20},%20caption,%20usertags%20{%20nodes%20{%20user%20{%20username%20},%20position%20}},%20likes%20{%20count,%20viewer_has_liked,%20nodes%20{%20user%20{%20username,%20profile_pic_url,%20followed_by_viewer,%20requested_by_viewer%20}}},%20comments.last(20)%20{%20nodes%20{%20id,%20user%20{%20username,%20profile_pic_url%20},%20text,%20viewer_can_delete%20}}} 20141014162923 {"url": "http://instagram.com/query/", "mime": "application/json", "status": "200", "digest": "EQP327EORTS2QTCXGOS54KCDO6X47LBB", "length": "1656", "offset": "11394396", "method": "POST", "requestBody": "q=ig_shortcode(s67XD2FV5l) { id, code, owner { id, username, is_private, profile_pic_url, followed_by_viewer, requested_by_viewer },is_video, video_url, shared_by_author, date, display_src, WB_wombat_location { name }, caption, usertags { nodes { user { username }, position }}, likes { count, viewer_has_liked, nodes { user { username, profile_pic_url, followed_by_viewer, requested_by_viewer }}}, comments.last(20) { nodes { id, user { username, profile_pic_url }, text, viewer_can_delete }}}", "filename": "excellences-and-perfections_desktop-p3.warc.gz"}
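The visible difference between the two keys is the `__wb_method=post&` prefix in the query part. If the analysis above is right, a lookup built in the newer style cannot hit an entry written in the older style. The sketch below illustrates that with plain string comparison; it is not pywb's actual lookup code (which canonicalizes and matches keys in its own way), `post_lookup_key` is a hypothetical helper, and both keys are shortened.

```python
# Shortened versions of the two index keys shown above.
OLD_KEY = "com,instagram)/query?q=ig_shortcode(s67xd2fv5l)"  # May 2019 index
NEW_KEY = "com,instagram)/query?__wb_method=post&q=ig_shortcode(s67xd2fv5l)"  # pywb 2.7.4 index


def post_lookup_key(urlkey_path, body_query):
    """Hypothetical helper: build a key in the style of the 2.7.4 entry."""
    return f"{urlkey_path}?__wb_method=post&{body_query}"


lookup = post_lookup_key("com,instagram)/query", "q=ig_shortcode(s67xd2fv5l)")

# Exact-match comparison, for illustration only.
print("matches 2019 index:", lookup == OLD_KEY)
print("matches 2.7.4 index:", lookup == NEW_KEY)
```

Under these assumptions only the re-created index contains a key the newer lookup can find, which is consistent with reindexing fixing replay.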

despens commented 1 month ago

This issue was solved by re-indexing the collection.