N0taN3rd / node-warc

Parse And Create Web ARChive (WARC) files with node.js
MIT License

Some WARCs lack bookmarks according to Webrecorder player #25

Closed peterk closed 5 years ago

peterk commented 5 years ago

Archived the following URL: https://www.facebook.com/socialdemokraternailjusdal/photos/a.1409766429240263/2084031858480380/?type=3&__xts__%5B0%5D=68.ARA3blS6QVatnljfKg2ED3yFCSVs2fEjWVC085o9H1oNPpiSDeld4Iu5HfWS59RvuteqLBXXBZZj0oN9I8r0S7RxjC_W77aYdiOtyPeaVCRfYm0O1rgzzqnYDIZTXDJEYPG-XJ0dpOoaGR8JI0JbP6NPCTYXaKKEPUUUKg1XihsVouag0W91ra3-Rqr-TpDrPm96rVOvgjIy8oe5Kse0ZV50kJ65pwWhKvBxm7bMoyTo1fsXAkK6sYdaM_iQhbT7PO25qk6VUbbrTlHSu5i7a3idF2huVM4KM7s-LaOZMPztninlNYMFjCjbJpOeK8wgNUrcXdzwPLsS3iYZ4-D4RcYwPSsU&__tn__=-R

Opening the resulting WARC in Webrecorder Player shows "No bookmarks available in the table":

[Screenshot: Webrecorder Player showing the empty bookmarks table]

Maybe the URL is too long or not escaped properly? Other (shorter) URLs seem to work fine.

Example zipped WARC file below.

fbtest.warc.zip
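One quick way to test the "too long or not escaped properly" guess is to run the URL through Node's built-in WHATWG `URL` parser and see whether it parses and serializes back unchanged. The sketch below is only an illustration: `describeUrl` is a hypothetical helper, not part of node-warc, and the example URL is a shortened stand-in for the one above.

```javascript
// Hypothetical helper (not part of node-warc): report basic facts about a URL
// that could matter when it is recorded as a WARC-Target-URI.
function describeUrl(rawUrl) {
  // The URL constructor throws a TypeError if the string cannot be parsed.
  const parsed = new URL(rawUrl);
  return {
    // Number of UTF-16 code units in the original string.
    length: rawUrl.length,
    // true when the parser's serialization is identical to the input,
    // i.e. no character had to be re-escaped or normalized.
    unchangedByParser: parsed.href === rawUrl,
    // Query parameter names after percent-decoding.
    queryKeys: [...parsed.searchParams.keys()],
  };
}

// Shortened stand-in for the Facebook URL in this issue (an assumption for the demo).
const info = describeUrl(
  'https://www.facebook.com/example/photos/?type=3&__xts__%5B0%5D=68.ABC&__tn__=-R'
);
console.log(info);
```

If `unchangedByParser` is true and the URL parses without throwing, escaping is probably not the culprit, although this check says nothing about how a particular replay tool handles long URLs.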

peterk commented 5 years ago

I think this is a bug in Webrecorder Player, since the same WARC opens fine in OpenWayback.

N0taN3rd commented 5 years ago

This is definitely more an issue with page detection than with playback itself. Blind page discovery is hard, but we are looking into how to solve it. See the comment by @ikreymer in https://github.com/webrecorder/webrecorder-player/issues/77
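To give a feel for what "blind page discovery" involves, here is a minimal sketch of one naive heuristic: treat every `response` record with an HTTP 200 status and a `text/html` content type as a candidate page. This is purely an assumption for illustration, not how Webrecorder Player actually detects pages, and it does not use node-warc's API. The helpers `iterWarcRecords`, `candidatePages`, and `makeRecord` are hypothetical, and the code only handles uncompressed WARC data held in memory.

```javascript
// Split an uncompressed WARC buffer into records, using each record's
// Content-Length header to find where its block ends.
function* iterWarcRecords(buf) {
  const sep = Buffer.from('\r\n\r\n');
  let pos = 0;
  while (pos < buf.length) {
    const headerEnd = buf.indexOf(sep, pos);
    if (headerEnd === -1) break;
    const lines = buf.toString('utf8', pos, headerEnd).split('\r\n');
    const headers = {};
    for (const line of lines.slice(1)) { // skip the "WARC/1.0" version line
      const i = line.indexOf(':');
      if (i > 0) {
        headers[line.slice(0, i).trim().toLowerCase()] = line.slice(i + 1).trim();
      }
    }
    const len = parseInt(headers['content-length'], 10);
    if (Number.isNaN(len)) break;
    const start = headerEnd + sep.length;
    yield { headers, block: buf.subarray(start, start + len) };
    pos = start + len + sep.length; // each record is followed by two CRLFs
  }
}

// Naive heuristic (an assumption): HTML responses with status 200 are pages.
function candidatePages(buf) {
  const pages = [];
  for (const { headers, block } of iterWarcRecords(buf)) {
    if (headers['warc-type'] !== 'response') continue;
    const httpHead = block.toString('latin1').split('\r\n\r\n')[0];
    const [statusLine, ...headerLines] = httpHead.split('\r\n');
    const isOk = /^HTTP\/\d(\.\d)?\s+200\b/.test(statusLine);
    const isHtml = headerLines.some((l) => /^content-type:\s*text\/html/i.test(l));
    if (isOk && isHtml) pages.push(headers['warc-target-uri']);
  }
  return pages;
}

// Build a tiny synthetic record for the demo (not a complete, valid WARC record).
function makeRecord(type, uri, httpPayload) {
  const body = Buffer.from(httpPayload);
  const head =
    `WARC/1.0\r\nWARC-Type: ${type}\r\nWARC-Target-URI: ${uri}\r\n` +
    `Content-Length: ${body.length}\r\n\r\n`;
  return Buffer.concat([Buffer.from(head), body, Buffer.from('\r\n\r\n')]);
}

const sample = Buffer.concat([
  makeRecord('request', 'https://example.com/', 'GET / HTTP/1.1\r\nHost: example.com\r\n\r\n'),
  makeRecord('response', 'https://example.com/',
    'HTTP/1.1 200 OK\r\nContent-Type: text/html; charset=utf-8\r\n\r\n<html></html>'),
  makeRecord('response', 'https://example.com/logo.png',
    'HTTP/1.1 200 OK\r\nContent-Type: image/png\r\n\r\nPNGDATA'),
]);
console.log(candidatePages(sample));
```

A heuristic like this can both over-report (embedded frames and HTML fragments also look like pages) and under-report (pages served with an unexpected status or content type), which is part of why discovery without recorded page metadata is hard.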