Open edsu opened 2 years ago
Would it be possible to create conflicting access, where one crawl includes a URL like https://example.com and it is "world", and another crawl includes the same URL and it is "dark"?
Good point, that is definitely possible. pywb's ACLJ file can also include the timestamp associated with the URL to block. So in theory that could be factored in if we decide we really need pywb to respect access rights changes related to Crawl Objects. At the moment no use cases have been given for access rights changes to Crawl Objects. This issue is mostly here to note that it isn't currently being handled.
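To make the conflict concrete, here is a minimal sketch of how overlapping URLs with different access levels could be detected across crawls. Everything in it is an assumption for illustration: the `crawls` input shape, the crawl ids, and the `normalize` and `find_conflicts` helpers are hypothetical, and the normalization is a simplistic stand-in, not pywb's own URL canonicalization.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalize(url):
    """Simplistic URL normalization (an assumption, not pywb's canonicalizer)."""
    parts = urlsplit(url)
    path = parts.path or "/"  # treat a bare host and a trailing slash as the same URL
    query = f"?{parts.query}" if parts.query else ""
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}{path}{query}"


def find_conflicts(crawls):
    """Return {url: {access: [crawl_id, ...]}} for URLs seen with more than one access level.

    `crawls` is a hypothetical mapping: crawl_id -> (access, [urls]).
    """
    by_url = defaultdict(lambda: defaultdict(list))
    for crawl_id, (access, urls) in crawls.items():
        for url in urls:
            by_url[normalize(url)][access].append(crawl_id)
    # Keep only the URLs that appear under two or more different access levels.
    return {url: dict(levels) for url, levels in by_url.items() if len(levels) > 1}


if __name__ == "__main__":
    crawls = {
        "crawl-a": ("world", ["https://example.com"]),
        "crawl-b": ("dark", ["https://example.com/", "https://example.org/page"]),
    }
    for url, levels in find_conflicts(crawls).items():
        print(url, levels)
```

A report like this would only surface the conflicts. Deciding which access level wins, or whether the capture timestamp should be used to tell the two crawls apart, is still an open question here.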
Iceboxing along with #10.
When the access rights for a Crawl Object are changed in Argo, we would like those changes to be respected by pywb so that the content is World, Stanford only, or Dark (unavailable). In #10 we address the issue of similar rights changes to Seed Objects. However, making similar changes to sets of WARC files will involve modifications to the CDXJ indexes themselves (to add or remove entries). It may prove difficult to make the contents of a WARC file available only on campus, since these controls operate at the URL level, and a given set of WARC files could contain many URLs at different sites.