sul-dlss / was-pywb

Configuration for Stanford's pywb instance
https://swap.stanford.edu

Changing seed objects access rights to dark #10

Open edsu opened 2 years ago

edsu commented 2 years ago

When the access rights for a Seed Object change to Dark, the given URL should no longer be accessible via swap.stanford.edu. If the rights are changed to World, any restriction on that URL should be removed. Rights changes associated with Crawl Objects are considered separately from this issue (see #27).

pywb has embargo and access controls that allow it to be configured to block access based on the URL itself.

The controls are stored in an ACLJ file. The wb-manager command line tool can be used to modify ACLJ files:

$ wb-manager acl -h
usage: wb-manager acl [-h] {add,remove,list,validate,match,importtxt} ...

positional arguments:
  {add,remove,list,validate,match,importtxt}

options:
  -h, --help            show this help message and exit

The current thinking on this is to write a daemon in Ruby, similar to the rolling_index in dor_indexing_app, which listens to RabbitMQ for objects that are changing and inspects them to see if they are Seed Objects. On finding a rights change to a Seed Object, the daemon will make the corresponding change to the ACLJ file.
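As a rough sketch of the last step, the daemon could map a rights change to a `wb-manager acl` invocation. Everything specific here is an assumption for illustration: the collection name `was`, the `dark`/`world` labels, and the choice of pywb's `block` access type for Dark (`exclude` may be preferred) are not settled in this issue, and `acl_command` is a hypothetical helper.

```ruby
# Sketch only: the collection name, the rights labels, and the use of "block"
# for Dark are assumptions, not decisions recorded in this issue.
ACL_COLLECTION = "was" # hypothetical pywb collection name

# Build the wb-manager argument list for a Seed Object rights change.
# Dark adds a blocking rule for the URL; World removes any rule for it.
def acl_command(url, rights, collection: ACL_COLLECTION)
  case rights.to_s.downcase
  when "dark"
    ["wb-manager", "acl", "add", collection, url, "block"]
  when "world"
    ["wb-manager", "acl", "remove", collection, url]
  else
    # Stanford-only and other rights are out of scope for this issue.
    raise ArgumentError, "unsupported rights value: #{rights.inspect}"
  end
end
```

Returning an argument list rather than a shell string means the URL never passes through a shell, so the result can be handed to `system(*cmd)` without quoting concerns.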

This daemon can be part of the was-pywb repository, and should get deployed and started by Capistrano.

lwrubel commented 2 years ago

As noted in https://github.com/sul-dlss/was-pywb/issues/5 there does not appear to be a pywb endpoint for these updates, and other IIPC users are making configuration changes manually.

edsu commented 2 years ago

I spoke with @ikreymer who recommended against Option 1 given the potential complexity of the change. He suggested Option 2 might be feasible, especially since the service could be a small wrapper around the wb-manager system calls, and could potentially be written in Ruby. Option 3 is still out there, and is really just a question of how feasible it is to add YAML editing on a share into how rights are managed elsewhere in SDR.
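For a sense of how small such a wrapper could be, here is a minimal sketch in Ruby. `run_wb_manager` is a hypothetical helper, not an existing API; it assumes `wb-manager` is on the `PATH` of the user running the service, and the `executable:` parameter exists only so the wrapper can be exercised without pywb installed.

```ruby
require "open3"

# Run wb-manager with the given arguments and return its standard output.
# Raises with the exit status and stderr so that failures are visible.
# Hypothetical helper: the name and the error format are illustrative only.
def run_wb_manager(args, executable: "wb-manager")
  stdout, stderr, status = Open3.capture3(executable, *args)
  unless status.success?
    raise "#{executable} #{args.join(' ')} failed " \
          "(exit #{status.exitstatus}): #{stderr.strip}"
  end
  stdout
end
```

A call such as `run_wb_manager(["acl", "list", "was"])` would then list the rules for a collection, where `was` is a placeholder collection name.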

edsu commented 2 years ago

I've updated the description to better reflect that we are only talking about changes to Seed Objects in this issue.

edsu commented 2 years ago

I've updated the description in this issue to reflect a conversation with @jcoyne about a possible architectural design for this. The current thinking is we will use RabbitMQ to listen for changes to seed objects, and make the relevant ACLJ updates.

lwrubel commented 2 years ago

The focus of this issue is to be able to change World rights to Dark via Argo, and vice versa, so that the ACLJ file is updated. Adding Stanford-only visibility is not part of this work.

edsu commented 2 years ago

Updated this issue to focus on the binary World / Dark change. Stanford-only access rights changes have been moved to #46.

edsu commented 2 years ago

@lwrubel and I had a quick conversation w/ @justinlittman about this, and he indicated that we would want to gauge how often we expect changes like this to be made before we create automation around it. The thought is that a new daemon service that isn't used very often could fail in hard-to-diagnose ways. Are there obvious downsides to continuing to do the manual exclusions? These are largely questions for @peterchanws & @andrewjbtw to consider before we move forward with implementing this.