WebCuratorTool / webcurator

The root of the webcurator tool project, containing all modules needed to run a fully functional webcurator tool.
Apache License 2.0

Add option to use soft links to warcs in the QA wayback folder #72

Closed hannakoppelaar closed 1 year ago

hannakoppelaar commented 2 years ago

Because the WaybackIndexer copies files from the store to a wayback input folder, every warc file exists on the filesystem twice until its parent target instance is archived. For users who have many target instances in a pre-archived state, a configuration option to use soft links to warc files instead of copies could save a significant amount of storage space. The structure of the wayback input folder would remain unchanged, but the warc file entries would be soft links to files inside the store directory.
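
A minimal sketch of what such an option could look like, using only the standard java.nio.file API. The class name WaybackInputPlacer, the useSymbolicLinks flag and the place method are hypothetical names chosen for illustration; they are not taken from the WaybackIndexer code, which may be organised quite differently.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: puts a warc file from the store into the wayback input folder. */
public class WaybackInputPlacer {

    // Assumed configuration option; the real property name is not defined in this issue.
    private final boolean useSymbolicLinks;

    public WaybackInputPlacer(boolean useSymbolicLinks) {
        this.useSymbolicLinks = useSymbolicLinks;
    }

    /**
     * Creates an entry for storeFile inside waybackInputDir, either as a
     * soft link to the file in the store or as a full copy of it.
     * The entry keeps the file name, so the folder structure is unchanged.
     */
    public Path place(Path storeFile, Path waybackInputDir) throws IOException {
        Files.createDirectories(waybackInputDir);
        Path entry = waybackInputDir.resolve(storeFile.getFileName());

        // Remove a stale entry; for a soft link this deletes the link, not its target.
        Files.deleteIfExists(entry);

        if (useSymbolicLinks) {
            // Link to the absolute path so the link resolves from any working directory.
            return Files.createSymbolicLink(entry, storeFile.toAbsolutePath());
        }
        return Files.copy(storeFile, entry, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Two caveats with this approach: creating soft links can require extra privileges on some platforms (Windows in particular), and a link becomes dangling if the file in the store is moved or deleted before the wayback entry is cleaned up, so the copy behaviour would probably need to stay the default.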