jhu-idc / idc-isle-dc

Islandora Digital Collections (IDC) development environment
MIT License

View for JScholarship Redirects #286

Closed little9 closed 2 years ago

little9 commented 2 years ago

This adds a View that exports a file in a format similar to the one used by Apache's RewriteMap.

To access the endpoint for this view:

curl -X GET https://test.digital.library.jhu.edu/jscholarship_redirects_rest\?query=\(ss_type\:islandora_object\)\&items_per_page\=500\&offset\=0 > out.csv

That only gets the first 500 items. Use the offset parameter to page through the results until you have exported metadata for every item in the repo.

This is best done in a script.
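
A minimal sketch of such a script, in Python. The endpoint and query come from the curl command above; the function names are made up for illustration. It assumes that `offset` counts items, that every page repeats the header row, and that a page with no data rows means the export is finished. None of that is confirmed by the View itself, so check it against a real response first.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint and query are taken from the curl command above.
BASE_URL = "https://test.digital.library.jhu.edu/jscholarship_redirects_rest"
QUERY = "(ss_type:islandora_object)"


def page_url(offset, page_size=500, base_url=BASE_URL):
    """Build the export URL for one page of results."""
    params = {"query": QUERY, "items_per_page": page_size, "offset": offset}
    return base_url + "?" + urlencode(params)


def fetch_text(url):
    """Download one page and return its body as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def export_all(fetch=fetch_text, page_size=500):
    """Collect the data rows from every page, without the header rows.

    Assumptions (hypothetical, not confirmed by the View): `offset` counts
    items, each page repeats the header, and an empty page marks the end.
    """
    rows = []
    offset = 0
    while True:
        body = fetch(page_url(offset, page_size))
        lines = [ln for ln in body.splitlines() if ln.strip()]
        # Drop the header row that starts with the first column name.
        data = [ln for ln in lines if not ln.startswith("dspace_identifier")]
        if not data:
            break
        rows.extend(data)
        offset += page_size
    return rows
```

Calling `export_all()` and writing the returned rows to a file, one per line, gives the full export. If the endpoint ignores `offset` past the last item, the loop would not terminate, so a page cap may be worth adding.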

The resulting file will look like:

dspace_identifier "Citable URL"
https://jscholarship.library.jhu.edu/handle/1774.2/57496 https://test.digital.library.jhu.edu/node/5321

To use the resulting file as a RewriteMap, remove the header row and strip the root URL from the JScholarship URLs:

/handle/1774.2/57496 https://test.digital.library.jhu.edu/node/5321
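
That cleanup could be sketched like this in Python. `to_rewrite_map` is a hypothetical helper, not part of this PR. It assumes whitespace-separated rows, and skipping rows whose identifier has no path is my own choice, since such a row cannot serve as a map key.

```python
from urllib.parse import urlsplit


def to_rewrite_map(lines):
    """Turn exported rows into 'path target' RewriteMap lines."""
    out = []
    for line in lines:
        parts = line.split()
        # Skip blank lines, the header row, and anything that is not
        # exactly two whitespace-separated columns.
        if len(parts) != 2 or parts[0] == "dspace_identifier":
            continue
        source, target = parts
        # Keep only the path of the JScholarship URL, e.g. /handle/1774.2/57496.
        path = urlsplit(source).path
        if not path or path == "/":
            continue
        out.append(f"{path} {target}")
    return out
```

Feeding it the header and the sample row above yields the single `/handle/1774.2/57496 https://test.digital.library.jhu.edu/node/5321` line.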

Then in the JScholarship Apache config (after moving the file to /etc/httpd/conf/redirects.txt):

  RewriteEngine On
  RewriteMap redirects "txt:/etc/httpd/conf/redirects.txt"
  RewriteCond ${redirects:$1} !=""
  RewriteRule ^(.*)$ ${redirects:$1} [redirect=permanent,last]
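
Before deploying the map, the lookup can be sanity-checked offline. The Python sketch below only imitates what the rules above do; it is not how mod_rewrite is implemented. The helper names are hypothetical. It assumes the rules sit in server or virtual host context, so the matched path keeps its leading slash, and that the first occurrence of a duplicated key wins.

```python
def load_map(lines):
    """Parse txt-style RewriteMap lines ('key value') into a dict."""
    mapping = {}
    for line in lines:
        line = line.strip()
        # Blank lines and '#' comment lines carry no mapping.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        # Assumption: the first occurrence of a key wins.
        mapping.setdefault(parts[0], parts[1])
    return mapping


def resolve(mapping, path):
    """Imitate the RewriteCond/RewriteRule pair for one request path.

    Returns (301, target) when the map has a non-empty entry for the
    path, otherwise None (the request is left untouched).
    """
    target = mapping.get(path, "")
    return (301, target) if target else None
```

For example, loading the one-line map shown earlier and resolving `/handle/1774.2/57496` should give a 301 to the `node/5321` URL, while any unmapped path returns `None`.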
github-actions[bot] commented 2 years ago

This PR has no dependency differences with the base branch

bseeger commented 2 years ago

When I ran the above curl command against the cloud test instance, I got records like this (with spaces as the separator):

dspace_identifier "Citable URL"
https://jscholarship.library.jhu.edu/handle/1774.2/57496 https://test.digital.library.jhu.edu/node/5321
https://jscholarship.library.jhu.edu/handle/1774.2/55393 https://test.digital.library.jhu.edu/node/6472
https://jscholarship.library.jhu.edu/handle/1774.2/55389 https://test.digital.library.jhu.edu/node/6473
https://jscholarship.library.jhu.edu/handle/1774.2/49491 https://test.digital.library.jhu.edu/node/6474
https://jscholarship.library.jhu.edu/handle/1774.2/55376 https://test.digital.library.jhu.edu/node/6475
https://jscholarship.library.jhu.edu/handle/1774.2/49493 https://test.digital.library.jhu.edu/node/6476
https://jscholarship.library.jhu.edu/handle/1774.2/49495 https://test.digital.library.jhu.edu/node/6477
https://jscholarship.library.jhu.edu/handle/1774.2/49494 https://test.digital.library.jhu.edu/node/6478
https://jscholarship.library.jhu.edu/handle/1774.2/55387 https://test.digital.library.jhu.edu/node/6479

Running against my local setup with this PR and some test data, I get this (with commas as the separator):

dspace_identifier,"Citable URL"
,https://islandora-idc.traefik.me/node/45
,https://islandora-idc.traefik.me/node/44
http://jscholarship.library.jhu.edu,https://islandora-idc.traefik.me/node/38
http://jscholarship.library.jhu.edu,https://islandora-idc.traefik.me/node/39
http://jscholarship.library.jhu.edu,https://islandora-idc.traefik.me/node/40
http://jscholarship.library.jhu.edu,https://islandora-idc.traefik.me/node/41
http://jscholarship.library.jhu.edu,https://islandora-idc.traefik.me/node/42
little9 commented 2 years ago

@bseeger Thanks for catching the separator! I also added a filter for the DSpace identifier.

github-actions[bot] commented 2 years ago

This PR has no dependency differences with the base branch

bseeger commented 2 years ago

I can review this again today. Note that the test failures are due to a testcafe upgrade; John created a PR yesterday to lock in a working version: https://github.com/jhu-idc/idc-isle-dc/pull/291

github-actions[bot] commented 2 years ago

Dependency diff between development base branch development (5c97a1b4435c359c1cf1548e4102bab67a0b7efd) and PR branch jscholarship-redirects (f6cf8f3092cb331e1c0925178792becf67c2d214):

diff --git a/deps.5c97a1b4435c359c1cf1548e4102bab67a0b7efd b/deps.pr-286.f6cf8f3092cb331e1c0925178792becf67c2d214
index 73121b1..2c086ca 100644
--- a/deps.5c97a1b4435c359c1cf1548e4102bab67a0b7efd
+++ b/deps.pr-286.f6cf8f3092cb331e1c0925178792becf67c2d214
@@ -135,7 +135,7 @@ islandora/jsonld                                          dev-8.x-1.x dfd99c4
 islandora/openseadragon                                   dev-8.x-1.x 5f847e5
 jean85/pretty-package-versions                            1.6.0              
 jhu-idc/controlled_access_terms                           dev-8.x-1.x 7a03834
-jhu-idc/idc-ui-theme                                      dev-main 278d742   
+jhu-idc/idc-ui-theme                                      dev-main 0c32304   
 jhu-idc/idc_defaults                                      dev-main cb033a6   
 jhu-idc/idc_export                                        dev-main 33c3b31   
 jhu-idc/idc_ui_module                                     dev-main c353d1e   
github-actions[bot] commented 2 years ago

This PR has no dependency differences with the base branch

bseeger commented 2 years ago

This probably resolves: https://github.com/jhu-idc/iDC-general/issues/450