Open kiler129 opened 1 year ago
To open a large WARC, the application has to process the entire file. We have the WACZ format to solve this problem: it precomputes the index and packages it together with the WARC. WACZ files have scaled to over 1TB, so this should definitely work. Are you able to run a command-line tool to convert WARC->WACZ? If so, we have py-wacz and now also js-wacz, both of which can do the conversion on the command line. After that, the file will open fairly quickly and you'll be able to search through it and replay right away.
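To illustrate why a precomputed index helps, here is a minimal sketch of the kind of pass a reader has to make over a WARC when no index exists. It is not how py-wacz, js-wacz, or the app itself are implemented: the function name `index_warc` is hypothetical, it assumes an uncompressed WARC (real archives are usually gzipped per record), and the real WACZ index format carries more fields than the offsets recorded here.

```python
import io


def index_warc(stream):
    """Scan an uncompressed WARC stream once and record where each record starts.

    Hypothetical helper for illustration only; it skips payloads instead of
    loading them, so memory use stays small regardless of file size.
    """
    entries = []
    while True:
        offset = stream.tell()
        line = stream.readline()
        if not line:
            break  # end of file
        if not line.strip():
            continue  # blank separator lines between records
        if not line.startswith(b"WARC/"):
            raise ValueError(f"unexpected data at offset {offset}")

        # Read the record headers up to the empty line that ends them.
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break
            name, _, value = line.partition(b":")
            headers[name.strip().lower().decode()] = value.strip().decode()

        # Jump over the payload rather than reading it.
        stream.seek(int(headers["content-length"]), io.SEEK_CUR)
        entries.append(
            {
                "offset": offset,
                "length": stream.tell() - offset,
                "type": headers.get("warc-type"),
                "uri": headers.get("warc-target-uri"),
            }
        )
    return entries
```

With an index like this stored next to the archive, a reader can seek straight to `offset` for a requested URL instead of re-reading every record on each load, which is the cost the WACZ conversion pays once up front.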
I'm sorry for the non-descriptive title, but there's nothing more specific I can really say.
I attempted to load the archive from https://archive.org/download/archiveteam_liveleak_20210506071950_2a306039 and it crashes every single time after loading ~3.5GB. I tried opening DevTools, and the last message printed is "Read 93000 records". Shortly after that, DevTools disconnects ("DevTools was disconnected from the page. ...").
I'm running the offline version on macOS v13.2.1 on an M1 Max with 32GB of memory. The memory pressure is low.