webrecorder / webrecorder-player

Webrecorder Player for Desktop (OSX/Windows/Linux). (Built with Electron + Webrecorder)
Apache License 2.0

Records Not Showing #33

Closed Saklad5 closed 5 years ago

Saklad5 commented 7 years ago

Only the first 500 records in WARC files appear when using Webrecorder Player 1.0.4 on macOS. I’ve opened the same file with the deprecated webarchiveplayer application, and also taken the WARC file apart by hand, and there are clearly thousands of records past that point. Is this an intentionally hard-coded limit? If so, why?

m4rk3r commented 7 years ago

Hey @Saklad5, right now there is a limit of 500 bookmarks, mostly due to frontend performance issues. It's a temporary solution that we're planning to fix in the near future.

If you're interested in working around it, I'd be happy to point out where the limit is set so you can try building the player without it.

Saklad5 commented 7 years ago

I’m currently using webarchiveplayer to view larger WARC files. If you feel a modified build of Webrecorder Player would be usable enough to improve on that, I’d appreciate you telling me what to change.

Speaking of larger files and performance issues, is CDX support planned? Should I open an issue for that instead of bringing it up here?

ikreymer commented 7 years ago

Can you tell us more about your use case? What is the size of the WARCs you are generally working with? The limit was definitely meant to be temporary, and if you know the URLs you can still enter them directly. The limit is set here: https://github.com/webrecorder/webrecorder/blob/master/webrecorder/webrecorder/uploadcontroller.py#L353 when detecting which pages are to be added as bookmarks. Our main use case for the Webrecorder Player was WARCs created with Webrecorder, which rarely exceed that amount in a single WARC. But we do want to eliminate the need for this limit.
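For anyone trying to picture what a cap like this looks like, here is a minimal sketch. It is not the code at the link above. The names `MAX_DETECTED_PAGES`, `is_page`, and `detect_pages`, and the record fields `status` and `mime`, are all hypothetical and chosen only for illustration. The one detail taken from this thread is the figure of 500.

```python
from itertools import islice
from typing import Iterable, Iterator, Optional

# Hypothetical constant; this thread only states that the cap is 500 bookmarks.
MAX_DETECTED_PAGES = 500


def is_page(record: dict) -> bool:
    """Hypothetical heuristic: treat successful HTML responses as pages."""
    return record.get("status") == 200 and record.get("mime", "").startswith("text/html")


def detect_pages(
    records: Iterable[dict], limit: Optional[int] = MAX_DETECTED_PAGES
) -> Iterator[dict]:
    """Yield page-like records, stopping after `limit` of them (None means no cap)."""
    pages = (r for r in records if is_page(r))
    return islice(pages, limit)


# Usage: 1200 synthetic HTML records, with and without the cap.
records = [
    {"url": f"http://example.com/{i}", "status": 200, "mime": "text/html"}
    for i in range(1200)
]
capped = list(detect_pages(records))
uncapped = list(detect_pages(records, limit=None))
print(len(capped), len(uncapped))  # 500 1200
```

In a layout like this, lifting the cap would be a one-argument change (`limit=None`), though the frontend would still have to cope with rendering that many bookmarks.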

What do you mean by 'CDX support'? Access to CDX Server API? Or providing CDX files separately? The answer is probably 'yes', but please open a separate issue with your question :)

ikreymer commented 5 years ago

The 500-page limit has been removed in the latest versions of Webrecorder Player (1.5.0 and up).