pulibrary / princeton_ansible

Ansible Roles and Playbooks for Princeton University Library

Redirect PUDL viewer URLs #4566

Closed escowles closed 3 months ago

escowles commented 9 months ago

We redirected most of the PUDL URLs that contain ARK IDs so the old URLs will get users to the new location of the content. But we just had a report from a user that this URL wasn't working:

https://pudl.princeton.edu/viewer.php?obj=2801pg40q#page/143/mode/2up

We should redirect that pattern:

pudl.princeton.edu/viewer.php?obj=[X] -> arks.princeton.edu/ark:/88435/[X]
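As a rough illustration of the mapping requested above, here is a minimal Python sketch, not the nginx configuration itself. The function name `redirect_target` and the `https://` scheme on the ARK prefix are assumptions made for the example. Only the hostnames, the `obj` parameter, and the `ark:/88435/` prefix come from the pattern above.

```python
from urllib.parse import urlsplit, parse_qs

# Prefix taken from the pattern above; the https scheme is an assumption.
ARK_PREFIX = "https://arks.princeton.edu/ark:/88435/"


def redirect_target(old_url):
    """Return the ARK URL for an old PUDL viewer URL, or None if it does not match.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(old_url)
    if parts.path != "/viewer.php":
        return None
    # parse_qs returns a list of values per key; take the first obj value.
    obj_ids = parse_qs(parts.query).get("obj")
    if not obj_ids:
        return None
    return ARK_PREFIX + obj_ids[0]


print(redirect_target(
    "https://pudl.princeton.edu/viewer.php?obj=2801pg40q#page/143/mode/2up"
))
# → https://arks.princeton.edu/ark:/88435/2801pg40q
```

Note that `urlsplit` places everything after the `#` in a separate fragment field, so the viewer details never end up in the object ID.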

acozine commented 3 months ago

I think this needs to be added to https://github.com/pulibrary/princeton_ansible/blob/main/roles/nginxplus/files/conf/http/templates/pudl_proxy_pass.conf.

acozine commented 3 months ago

We used https://regex101.com/ to test the regex for this redirect.

/^\/viewer.php\?obj=(.*) matches /viewer.php?obj=2801pg40q and returns just the object ID 2801pg40q. However, we need to handle the viewer details (#page/143/mode/2up) in the example above.

We found this post and modified it for the example.

/^\/viewer.php\?obj=(.*)?(?=#) matches both /viewer.php?obj=2801pg40q#page/143/mode/2up and /viewer.php?obj=2801pg40q#. Both examples return just the object ID. It does not match if there is no # after the object ID. We don't know enough about the old PUDL viewer to be sure . . . would it ever have served up URLs without a # after the object ID?
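The behaviour described above can be reproduced with a short Python sketch. Python's `re` module stands in for the regex101 test here, and the literal dot in `viewer.php` is escaped. The third pattern, `UP_TO_HASH`, is a hypothetical alternative that is not from the thread: it captures everything up to an optional `#` (or `&`), so it also matches when no `#` is present.

```python
import re

# The two patterns from the comment above, with the literal dot escaped.
GREEDY = re.compile(r"^/viewer\.php\?obj=(.*)")
LOOKAHEAD = re.compile(r"^/viewer\.php\?obj=(.*)?(?=#)")
# Hypothetical alternative (an assumption, not from the thread):
# stop at the first '#' or '&' instead of requiring a '#'.
UP_TO_HASH = re.compile(r"^/viewer\.php\?obj=([^#&]+)")


def capture(pattern, path):
    """Return the first capture group, or None if the pattern does not match."""
    match = pattern.match(path)
    return match.group(1) if match else None


samples = [
    "/viewer.php?obj=2801pg40q#page/143/mode/2up",
    "/viewer.php?obj=2801pg40q#",
    "/viewer.php?obj=2801pg40q",
]

for path in samples:
    print(path)
    print("  greedy:    ", capture(GREEDY, path))
    print("  lookahead: ", capture(LOOKAHEAD, path))
    print("  up to hash:", capture(UP_TO_HASH, path))
```

The lookahead pattern returns `None` for the last sample, which is the gap raised in the question above. In practice browsers generally do not send the `#` fragment to the server at all, so a server-side rule would most likely only ever see the form without it.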

escowles commented 3 months ago

@acozine I don't think we need to handle the part of the URL after the "#". We have a ticket to support linking to individual pages (https://github.com/pulibrary/figgy/issues/5151), but it has not been implemented. Getting the user to the digital object would be a much better experience, even if they have to navigate to the correct page themselves.