Open nar001 opened 5 years ago
I'm running into the same issue on the latest release on macOS. It'll index and then stall, eating up CPU like crazy. This is the additional information available:
http://localhost:54292
just lists a JSON file:
{"/live": {"modes": ["list_sources", "index", "resource"]}, "/live/postreq": {"modes": ["list_sources", "index", "resource"]}, "/extract": {"modes": ["list_sources", "index", "resource"]}, "/extract/postreq": {"modes": ["list_sources", "index", "resource"]}, "/replay": {"modes": ["list_sources", "index", "resource"]}, "/replay/postreq": {"modes": ["list_sources", "index", "resource"]}, "/replay-coll": {"modes": ["list_sources", "index", "resource"]}, "/replay-coll/postreq": {"modes": ["list_sources", "index", "resource"]}, "/patch": {"modes": ["list_sources", "index", "resource"]}, "/patch/postreq": {"modes": ["list_sources", "index", "resource"]}}
The WARC file I'm trying to open is 5.01 GB. Please let me know if you need any additional information :)
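For readers unsure what the JSON above represents: it appears to be a plain listing of route prefixes, each with the modes it supports, rather than an error report. A minimal sketch for summarizing such a listing follows. The `summarize_routes` helper is hypothetical, and the sample is abridged to three of the entries quoted above.

```python
import json

# Route listing abridged to three entries copied from the response above.
listing = json.loads("""
{"/live": {"modes": ["list_sources", "index", "resource"]},
 "/replay": {"modes": ["list_sources", "index", "resource"]},
 "/patch/postreq": {"modes": ["list_sources", "index", "resource"]}}
""")


def summarize_routes(listing):
    """Return sorted (route, modes) pairs; hypothetical helper for illustration."""
    return sorted(
        (route, tuple(info.get("modes", []))) for route, info in listing.items()
    )


for route, modes in summarize_routes(listing):
    print(f"{route}: {', '.join(modes)}")
```

Every route in the pasted listing reports the same three modes, so the listing by itself does not seem to point at the cause of the stall.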
Please try the 1.8.0 release. We've made some improvements to large WARC indexing, and it should work much better.
I have the same issue with (small) HAR files and the 1.8.0 release (macOS).
The progress bar is at 100%.
Extra Debug Info is:
Created user local with the email test@localhost and the role: 'public-archivist'
ERROR PARSING: /path/to/file.har
'pages'
WARCSERVER_HOST=http://localhost:52971
skip {'name': 'Admin', 'description': 'Admin API'}
skip {'name': 'Stats', 'description': 'Stats API'}
skip {'name': 'Automation', 'description': 'Automation API'}
APP_HOST=http://localhost:52972
The page on http://localhost:52972 shows the message "Almost Done!" and a progress bar at 100%. It seems that everything is finished but something else fails …
I ran some more tests. With HAR files from the Safari developer toolbar there seems to be no problem. A simple website (like "hello world" without any other files) is OK, and this GitHub page here is also OK.
But a HAR file saved with the Firefox developer toolbar has the problem described above. Even a really simple HAR file fails.
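The bare `'pages'` printed after `ERROR PARSING` in the debug output resembles the way Python prints a `KeyError`, which would suggest the parser looked up a `pages` key that the file does not contain. That reading is an assumption, not something the log confirms. A minimal sketch for checking a HAR file for that key (the helper names `har_missing_keys` and `check_har_file` are hypothetical, and the sample data is made up for illustration):

```python
import json


def har_missing_keys(har):
    """Return which of `pages` / `entries` are absent from the HAR `log` object."""
    log = har.get("log", {})
    return [key for key in ("pages", "entries") if key not in log]


def check_har_file(path):
    """Load a HAR file from disk and report its missing keys."""
    # utf-8-sig tolerates a leading byte-order mark, if the file has one.
    with open(path, encoding="utf-8-sig") as fh:
        return har_missing_keys(json.load(fh))


# Hypothetical minimal HAR that has `entries` but no `pages` list.
sample = {
    "log": {
        "version": "1.2",
        "creator": {"name": "example", "version": "0"},
        "entries": [],
    }
}
print(har_missing_keys(sample))  # ['pages']
```

Running `check_har_file("/path/to/file.har")` on a file that fails and on one that loads (for example, one saved from Safari) would show whether the two actually differ in this respect.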
So I'm trying to load a WARC file. It goes to 100% and then tells me it stalled. Navigating to the interface says "Almost Done!" but it never goes any further. A long time ago, I used to be able to browse it, but now it doesn't work and I'm not quite sure why. Thanks!