ikreymer / webarchiveplayer

NOTE: This project is no longer being actively developed. Check out Webrecorder Player for the latest player: https://github.com/webrecorder/webrecorderplayer-electron (Legacy: Desktop application for browsing web archives (WARC and ARC))
GNU General Public License v3.0
195 stars, 20 forks

Getting 'permanently moved' results on 200/OK WARC entries #22

Open christianleger opened 8 years ago

christianleger commented 8 years ago

I'm getting 'permanently moved to here' results when loading a WARC I made. The word 'here' is a link, and clicking it just reloads the same 'permanently moved' page. This would make sense if the URL in question had a 30x result in my WARC, but I'm getting it for some 200/OK request/response pairs. Any idea why this would be?

Thank you for your time.

christianleger commented 8 years ago

One more observation: I was able to view the record in replay, instead of getting the redirect, by changing the timestamp. It seems that if the response record has the same timestamp as the request record, I get the redirect!

christianleger commented 8 years ago

Another observation: changing record times seems to help sometimes, but not reliably.

Overall, although the WARC standard says all records (in a given capture) need to have the same timestamp, giving all records the same timestamp often results in 'permanently moved' pages, while providing different timestamps occasionally makes pages readable.
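
For anyone trying to reproduce this, a rough way to check which request/response pairs share a timestamp is to dump the WARC-Type, WARC-Date, WARC-Target-URI, and first line of each record. The following is only a minimal sketch using the Python standard library; it assumes an uncompressed .warc file with well-formed Content-Length headers, and the function names (`iter_warc_headers`, `summarize`, `shared_timestamps`) are made up for illustration and are not part of webarchiveplayer.

```python
def iter_warc_headers(stream):
    """Yield one dict of lower-cased WARC headers per record.

    Assumes an uncompressed WARC opened in binary mode. The first line of
    each record block (the HTTP request or status line for request and
    response records) is stored under the made-up key "_first_line".
    """
    while True:
        line = stream.readline()
        if not line:
            return  # end of file
        if not line.strip():
            continue  # blank separator lines between records
        if not line.startswith(b"WARC/"):
            raise ValueError("expected a WARC version line, got %r" % line)
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # end of the WARC header section
            name, _, value = line.decode("utf-8", "replace").partition(":")
            headers[name.strip().lower()] = value.strip()
        block = stream.read(int(headers.get("content-length", 0)))
        headers["_first_line"] = block.split(b"\r\n", 1)[0].decode("latin-1")
        yield headers


def summarize(stream):
    """Return (type, date, uri, first line of block) for every record."""
    return [
        (h.get("warc-type"), h.get("warc-date"),
         h.get("warc-target-uri"), h["_first_line"])
        for h in iter_warc_headers(stream)
    ]


def shared_timestamps(rows):
    """Return URIs whose request and response records share a WARC-Date."""
    seen = {}
    for rtype, date, uri, _ in rows:
        seen.setdefault((uri, date), set()).add(rtype)
    return sorted(
        uri for (uri, _date), types in seen.items()
        if {"request", "response"} <= types
    )
```

Calling `summarize(open("my.warc", "rb"))` (the file name is a placeholder) and comparing the rows for the affected URLs should show whether the 'permanently moved' pages line up with records that share a timestamp. This only inspects the file; it says nothing about how the player indexes or resolves those records.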

machawk1 commented 8 years ago

@ruggy, can you upload a WARC that exhibits this behavior on replay?