webrecorder / archiveweb.page

A High-Fidelity Web Archiving Extension for Chrome and Chromium-based browsers!
https://chrome.google.com/webstore/detail/webrecorder/fpeoodllldobpkbkabpblcfaogecpndd
GNU Affero General Public License v3.0

Document where the desktop application stores data #77

Open despens opened 2 years ago

despens commented 2 years ago

I am going to work with a number of larger collections that I want to store on an external drive. At what path does the desktop application store its data? Ideally this would be user-definable, but for now, knowing the path would be enough to set up a mount.

Using the AppImage, I located these directories on my system, but couldn't find any WARC or WACZ files inside:

despens@slice:~/.config$ ls -l | grep 'page'
drwx------  3 despens despens     3 Mär 26 07:56 archiveweb.page
drwx------ 15 despens despens    23 Apr 10 08:50 archivewebpage
drwx------  5 despens despens     8 Apr 10 08:48 ArchiveWeb.page

despens commented 2 years ago

Doing some file system monitoring while capturing a bigger resource on my Linux system, I found the data is stored under

~/.config/archivewebpage/IndexedDB

The storage format is opaque: there are no WARC, CDX, or WACZ files to be found, only what seems like a mixture of binary data and JSON, probably a format mandated by Electron.
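
The check for archive files can be repeated with a short script. This is only a minimal sketch and not part of ArchiveWeb.page: the function name `find_archive_files` and the list of extensions are assumptions made for illustration, and the directory is the one found on this Linux system.

```python
from pathlib import Path

# Extensions to look for; an illustrative list, not an official one.
ARCHIVE_SUFFIXES = {".warc", ".wacz", ".cdx", ".cdxj"}


def find_archive_files(root):
    """Return all files under root whose name carries a known archive extension."""
    root = Path(root).expanduser()
    if not root.is_dir():
        return []
    hits = []
    for path in root.rglob("*"):
        # Check every suffix so compressed names such as "example.warc.gz" match too.
        suffixes = {s.lower() for s in path.suffixes}
        if path.is_file() and ARCHIVE_SUFFIXES & suffixes:
            hits.append(path)
    return sorted(hits)


if __name__ == "__main__":
    # Directory observed in this thread for the Linux AppImage.
    for hit in find_archive_files("~/.config/archivewebpage"):
        print(hit)
```

If the script prints nothing, the directory contains no files with those extensions, which matches what was observed here.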

I have not yet tried mounting a larger drive at this path.

anisa-hawes commented 2 years ago

Thank you, @despens! On macOS, the data is found at ~/Library/Application Support/archivewebpage/IndexedDB/.
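
For anyone scripting around this, the two locations reported in this thread can be collected in a small helper. This is a rough sketch under stated assumptions: the names `indexeddb_dir` and `dir_size_bytes` are hypothetical, the Linux path assumes the default `~/.config` location, and no Windows path is included because none has been reported here.

```python
import sys
from pathlib import Path


def indexeddb_dir(platform=sys.platform, home=None):
    """Return the IndexedDB directory reported in this thread for a platform."""
    home = Path(home) if home is not None else Path.home()
    if platform.startswith("linux"):
        return home / ".config" / "archivewebpage" / "IndexedDB"
    if platform == "darwin":
        return home / "Library" / "Application Support" / "archivewebpage" / "IndexedDB"
    # No location for other platforms has been reported, so do not guess one.
    raise NotImplementedError(f"no known data path for platform {platform!r}")


def dir_size_bytes(root):
    """Sum the sizes of all regular files below root (0 if root does not exist)."""
    root = Path(root)
    if not root.is_dir():
        return 0
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    target = indexeddb_dir()
    print(target, dir_size_bytes(target), "bytes")
```

Checking the size first gives a rough idea of how much space an external drive would need to hold the existing data.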

I agree it would be useful to be able to define where the data is saved on our systems (especially for large files, which we might want to save on an external hard drive).

It would be super to be able to identify and access the .warc or .wacz files stored there, so we could back up specific collections.

Shrinks99 commented 7 months ago

Now that the question has been answered, I'm hijacking this issue to turn it into a documentation issue. Users on the forum have asked this question as well, so it would be a good thing to document!

Shrinks99 commented 5 months ago

Adding another note on where the Chrome extension saves the database that the app uses: see my forum post on the issue.