Islandora / documentation

Contains Islandora's documentation and main issue queue.
MIT License

Cookbook: Multi-Tenancy with Aggregated Search & Browse #1300

Open bondjimbond opened 4 years ago

bondjimbond commented 4 years ago

This configuration will allow multiple Islandora 8 sites to share single installations of Fedora and Solr, replicating the multi-site capabilities of Islandora 7. Each site's repository objects are segregated within Fedora and indexed in that site's own Solr core. They are also indexed for a "parent" site that can search and display content from all of the child sites together, without duplicating media.

Addresses or relates to the following extant Islandora 8 issues:

Workflow


The Components


Fedora Configuration


Fedora must be configured with a separate container (root) for each tenant.

  1. When a new site is set up, create a new basic container in Fedora directly under the root, something like <http://localhost:8080/fcrepo/rest/site1>. This step should be added to the playbook.

  2. Prepend the site1 path to all requests so that objects are placed under that root, like <http://localhost:8080/fcrepo/rest/site1/lo/ng/ur/lgoeshere>. This could be handled by adjusting fedora_base_url and crayfish_fedora_base_url in the playbook.

    • Probable approach: Change the Flysystem settings in settings.php to point at the new root, and pass the root into the mintFedoraUrl function in Gemini so that the URL passed to Fedora reflects the new root.
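The two steps above can be sketched in Python. This is only an illustration under stated assumptions, not how Gemini or the playbook actually implement it: the function names are hypothetical, the pair-tree layout (four two-character segments followed by the full UUID) is assumed from the example URL, and the PUT request is built but not sent, since a real Fedora would likely also need credentials.

```python
from urllib.parse import quote
from urllib.request import Request


def tenant_root(fedora_base_url: str, tenant: str) -> str:
    """Return the tenant's container URL, e.g. .../fcrepo/rest/site1."""
    return f"{fedora_base_url.rstrip('/')}/{quote(tenant, safe='')}"


def create_container_request(fedora_base_url: str, tenant: str) -> Request:
    """Build (but do not send) a PUT that would create the tenant container."""
    return Request(tenant_root(fedora_base_url, tenant), method="PUT")


def mint_tenant_url(fedora_base_url: str, tenant: str, uuid: str) -> str:
    """Prefix a pair-tree object path with the tenant root.

    Assumed layout: the first four two-character pairs of the UUID,
    followed by the full UUID.
    """
    pairs = [uuid[i:i + 2] for i in range(0, 8, 2)]
    return "/".join([tenant_root(fedora_base_url, tenant), *pairs, uuid])
```

With a base of `http://localhost:8080/fcrepo/rest` and tenant `site1`, every minted URL starts with `http://localhost:8080/fcrepo/rest/site1/`, which is the segregation the steps describe.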

Solr Configuration


The Islandora 8 installation playbook must be modified to include a new configuration option to spin up a multi-tenancy implementation. These steps should be added to the playbook, but are also included below for manual implementation.

Create a new Solr core for your new tenant (site1)

  1. cd /var/solr/data

  2. mkdir site1

  3. Copy the conf directory from an existing Solr core to the new directory.

    • E.g. cp -r CLAW/conf site1/conf

Create an aggregate Solr core (Site_All) by following the same steps.
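If the manual steps were scripted rather than added to the playbook, they might look like the following Python sketch. The function name `create_core_dir` is hypothetical, and the sketch only mirrors the mkdir and cp steps listed above. It does not register the core with Solr, which may need to be done separately.

```python
import shutil
from pathlib import Path


def create_core_dir(solr_data: Path, template_core: str, new_core: str) -> Path:
    """Create <solr_data>/<new_core> and copy conf/ from an existing core."""
    source_conf = solr_data / template_core / "conf"
    if not source_conf.is_dir():
        raise FileNotFoundError(f"No conf directory at {source_conf}")
    target = solr_data / new_core
    target.mkdir(exist_ok=True)  # mkdir site1
    # cp -r CLAW/conf site1/conf; raises if site1/conf already exists,
    # so an existing core's configuration is never overwritten.
    shutil.copytree(source_conf, target / "conf")
    return target
```

Calling `create_core_dir(Path("/var/solr/data"), "CLAW", "site1")` and then the same with `"Site_All"` would cover both the tenant core and the aggregate core.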

Drupal Configuration


Using the CLAW Playbook, install the Parent/Aggregate site first.

Then modify variables in the CLAW playbook to create separate child site installs according to your needs.

The Parent site is configured to use an aggregated Solr core.

The Child sites are configured to write simultaneously to both their own core and the aggregate core.

Child sites will display only their own content, while the Parent site can display everything.
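The read/write split described above can be pictured with a small Python sketch. It is purely illustrative and not Drupal or Search API configuration: the function names and the `http://localhost:8983/solr` base URL are assumptions, and the core names follow the Site_All example from the Solr section.

```python
from typing import List


def index_targets(solr_base_url: str, site_core: str,
                  aggregate_core: str = "Site_All") -> List[str]:
    """Update endpoints a child site writes to: its own core, then the aggregate."""
    base = solr_base_url.rstrip("/")
    return [f"{base}/{core}/update" for core in (site_core, aggregate_core)]


def search_target(solr_base_url: str, site_core: str, is_parent: bool,
                  aggregate_core: str = "Site_All") -> str:
    """Query endpoint: the parent reads the aggregate core, a child reads its own."""
    base = solr_base_url.rstrip("/")
    core = aggregate_core if is_parent else site_core
    return f"{base}/{core}/select"
```

Because each child only ever queries its own core, it cannot surface another tenant's content, while the parent sees everything through the aggregate core.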

Individual Tenants

Setup (multisite or individual installs)

Fedora configuration

Solr configuration

Configure Solr to write to both Site1 core and Site_All core:

Security and Access Control: DEVELOPMENT NEEDED

Parent/Aggregate Site

Setup

Security and Access Control

Search configuration

Browse configuration: DEVELOPMENT NEEDED

FUTURE DEVELOPMENT: Manage objects from parent site

A valuable feature of Islandora 7 multisites is that an administrator can edit and otherwise manage objects from within the parent site. We could use the REST API to replicate this functionality.

elizoller commented 4 years ago

^Correction to the above: you probably wouldn't want to change crayfish_fedora_base_url, and would instead rely on the minter to handle Fedora URLs that include a root.

whikloj commented 4 years ago

My suggestion is that we accept the Fedora base URI as part of the request to Milliner. Then a configuration change on the Drupal side would affect the URIs minted and where (i.e. the URI of Fedora) Milliner pushes to.

bondjimbond commented 4 years ago

@whikloj I don't know what Milliner is, but your suggestion sounds good.

whikloj commented 4 years ago

Milliner is the Crayfish microservice that PUTs RDF into Fedora. https://github.com/Islandora-CLAW/Crayfish/tree/dev/Milliner

bondjimbond commented 4 years ago

Then that sounds excellent. The more configuration we can make part of the Drupal UI, the better.

bondjimbond commented 4 years ago

Another thought on browse mode: we need to sort out a recommended Matomo configuration for tracking the usage of objects across both the individual sites and the aggregator site.

bryjbrown commented 4 years ago

@bondjimbond You could do that by setting up Matomo to send a unique identifier (UUID?) as part of the tracking, and then set up a Matomo custom API where you can request data for that specific unique identifier aggregated from all the sites.