Closed jsjiang closed 6 months ago
From Dave:
The long term solution is to set the NAAN registry entries with the correct information. This would mean no changes to EZID.
That said, it would still be beneficial for EZID to implement an API for listing shoulders, or more precisely for listing NAANs. This should be done as part of the resolve inflection operation, much like the existing functionality that returns shoulders given a NAAN. For example:
https://ezid.cdlib.org/ark:/99999?info
lists the shoulders for the NAAN 99999.
A request like:
https://ezid.cdlib.org/ark:?info
would return a list of ARK NAANs.
The shoulder-list.txt file is not a desirable solution and was only put in place pending updates to the NAAN registry for EZID managed NAANs.
Dave
Currently the manage shoulder-list operation only saves output to the log file (with the --debug option). It does not save output to the shoulder-list.txt file.
Also, the log entry format differs slightly from that of the temporary file https://ezid.cdlib.org/static/info/shoulder-list.txt
Temporary file data format:
impl.nog.shoulder INFO shoulder - Shoulders:
impl.nog.shoulder INFO shoulder - ark:/13030/bn ?1?
impl.nog.shoulder INFO shoulder - doi:10.5070/L6 Aleph, UCLA Undergraduate Research Journal for the Humanities and Social Sciences
impl.nog.shoulder INFO shoulder - doi:10.5070/LN4 Alon: Journal for Filipinx American and Diasporic Studies
impl.nog.shoulder INFO shoulder - doi:10.17953/A3 American Indian Culture and Research Journal
impl.nog.shoulder INFO shoulder - doi:10.17953/ American Indian Culture and Research Journal (no minter)
impl.nog.shoulder INFO shoulder - ark:/86073/b3 American University of Beirut
impl.nog.shoulder INFO shoulder - ark:/99999/fk4 ARK Test
impl.nog.shoulder INFO shoulder - ark:/99999/fk8 ARK Test (non-expiring)
impl.nog.shoulder INFO shoulder - doi:10.5070/RJ4 Asian American Research Journal
impl.nog.shoulder INFO shoulder - doi:10.5070/P3 Asian Pacific American Law Journal
impl.nog.shoulder INFO shoulder - doi:10.5070/BK8 Backbone
impl.nog.shoulder INFO shoulder - ark:/85779/j4 Berkeley Law Library
impl.nog.shoulder INFO shoulder - doi:10.15779/J2 Berkeley Law Library
impl.nog.shoulder INFO shoulder - doi:10.15779/Z38 Berkeley Law School Journals
Log entry from manage shoulder-list:
shoulder.py:81 shoulder impl.nog_sql.shoulder INFO shoulder - Shoulders:
shoulder.py:83 shoulder impl.nog_sql.shoulder INFO shoulder - ark:/13030/bn ?1?
shoulder.py:83 shoulder impl.nog_sql.shoulder INFO shoulder - doi:10.5070/L6 Aleph, UCLA Undergraduate Research Journal for the Humanities and Social Sciences
shoulder.py:83 shoulder impl.nog_sql.shoulder INFO shoulder - doi:10.5070/LN4 Alon: Journal for Filipinx American and Diasporic Studies
shoulder.py:83 shoulder impl.nog_sql.shoulder INFO shoulder - ark:/86073/b3 American University of Beirut
shoulder.py:83 shoulder impl.nog_sql.shoulder INFO shoulder - ark:/99999/fk4 ARK Test
shoulder.py:83 shoulder impl.nog_sql.shoulder INFO shoulder - ark:/10945/ ARK Test (no minter)
shoulder.py:83 shoulder impl.nog_sql.shoulder INFO shoulder - ark:/99999/fk8 ARK Test (non-expiring)
shoulder.py:83 shoulder impl.nog_sql.shoulder INFO shoulder - doi:10.5070/RJ4 Asian American Research Journal
shoulder.py:83 shoulder impl.nog_sql.shoulder INFO shoulder - doi:10.5070/P3 Asian Pacific American Law Journal
shoulder.py:83 shoulder impl.nog_sql.shoulder INFO shoulder - doi:10.5070/BK8 Backbone
shoulder.py:83 shoulder impl.nog_sql.shoulder INFO shoulder - ark:/85779/j4 Berkeley Law Library
shoulder.py:83 shoulder impl.nog_sql.shoulder INFO shoulder - doi:10.15779/J2 Berkeley Law Library
shoulder.py:83 shoulder impl.nog_sql.shoulder INFO shoulder - doi:10.15779/Z38 Berkeley Law School Journals
@datadavev Hi Dave, do you require the shoulder-list.txt file in the above log format? Would a TSV file with shoulder and description work?
Jing
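As a rough illustration of the TSV option, the sketch below pulls the shoulder and description out of either log format shown above. It assumes every data line contains the text " INFO shoulder - " followed by a shoulder that starts with "ark:/" or "doi:"; the function names are hypothetical and not part of EZID.

```python
# Separator that appears in both the old and the new log formats quoted above.
MARKER = " INFO shoulder - "


def parse_shoulder_line(line):
    """Return (shoulder, description) for a data line, or None otherwise."""
    _, sep, payload = line.partition(MARKER)
    if not sep:
        return None  # not a shoulder log line at all
    payload = payload.strip()
    if not payload.startswith(("ark:/", "doi:")):
        return None  # e.g. the "Shoulders:" header line
    shoulder, _, description = payload.partition(" ")
    return shoulder, description.strip()


def to_tsv(lines):
    """Convert log lines to TSV text with one 'shoulder<TAB>description' row each."""
    rows = (parse_shoulder_line(line) for line in lines)
    return "\n".join(f"{s}\t{d}" for s, d in (r for r in rows if r))


sample_lines = [
    "impl.nog.shoulder INFO shoulder - Shoulders:",
    "impl.nog.shoulder INFO shoulder - ark:/99999/fk4 ARK Test",
    "shoulder.py:83 shoulder impl.nog_sql.shoulder INFO shoulder - doi:10.5070/BK8 Backbone",
]
tsv_text = to_tsv(sample_lines)
```

Because the parser keys on the shared marker rather than on the line prefix, the same code handles both the temporary file and the newer log entries.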
All that is needed is a list of NAANs managed by EZID. A natural pattern to follow is to provide the list of prefixes (NAANs) given a scheme (ark:). If a client wanted more details on a particular NAAN, they could use an inflection request on the NAAN.
So basically:
https://ezid.cdlib.org/ark:?info -> list of NAANs
https://ezid.cdlib.org/ark:/NAAN?info -> list of shoulders for NAAN
For the list of NAANs, it can be:
{
"prefixes": [
"86073",
"10945",
...
]
}
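A client for this pattern might look roughly like the sketch below. It assumes the response has the {"prefixes": [...]} shape proposed above (the ark:?info endpoint is a proposal in this thread, not an existing API) and makes no network calls; the helper names are hypothetical.

```python
import json

EZID_BASE = "https://ezid.cdlib.org"  # base URL as used in this thread


def naan_info_url(naan, base=EZID_BASE):
    """Build the inflection URL that lists the shoulders for one NAAN."""
    return f"{base}/ark:/{naan}?info"


def parse_naan_list(body):
    """Extract the NAANs from a JSON body shaped like {"prefixes": [...]}."""
    data = json.loads(body)
    prefixes = data.get("prefixes", [])
    # Keep only string entries so malformed items are ignored.
    return [p for p in prefixes if isinstance(p, str)]


# Sample body using the two NAANs shown in the example above.
sample_body = '{"prefixes": ["86073", "10945"]}'
naans = parse_naan_list(sample_body)
shoulder_urls = [naan_info_url(n) for n in naans]
```

Each entry in shoulder_urls is then the per-NAAN inflection request a client would issue for more detail.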
/apps/ezid/bin/run-shoulder-list.sh
on ezid-dev/apps/ezid/ezid/static/info/shoulder-list.txt
Cron job entry
42 14 * * 1-7 /ezid/bin/run-shoulder-list.sh
Wrapper script:
#!/bin/bash
#
# wrapper script providing shell environment to run ezid shoulder-list command
# from crontab.
#
# This file is managed by puppet
export PYTHONPATH=$HOME/ezid
export DJANGO_SETTINGS_MODULE=settings.settings
PYENV_ROOT=$HOME/.pyenv
$PYENV_ROOT/shims/django-admin shoulder-list --debug > /apps/ezid/ezid/static/info/shoulder-list.txt
@ashleygould Hi Ashley, can you add run-shoulder-list.sh to the cron on ezid-prd? The script is on ezid-dev in the bin directory: ezid@uc3-ezidui01x2-dev:/apps/ezid/bin/run-shoulder-list.sh. Please set it up to run every day at 3:15am.
Let me know if you have questions.
Thank you
Jing
Note: Puppet removes this file when deploying a new EZID release, so the run-shoulder-list.sh script needs to be run after new code has been deployed.
This is complete. Please validate that the job runs as expected.
ezid@uc3-ezidui02x2-prd:15:40:10:~$ ll bin/run-shoulder-list.sh
-rwxr-xr-x 1 ezid ezid 491 Dec 15 15:39 bin/run-shoulder-list.sh
ezid@uc3-ezidui02x2-prd:15:40:16:~$ crontab -l
# HEADER: This file was autogenerated at 2023-12-15 15:39:50 -0800 by puppet.
# HEADER: While it can still be managed manually, it is definitely not recommended.
# HEADER: Note particularly that the comments starting with 'Puppet Name' should
# HEADER: not be deleted, as doing so could cause duplicate cron jobs.
# Puppet Name: ezid-rotate_and_compress_logs
59 23 * * 1-7 /bin/date >> /ezid/var/log/uc3_logrotate.log ; /usr/sbin/logrotate --state /ezid/etc/logrotate.status /ezid/etc/logrotate.conf >> /ezid/var/log/uc3_logrotate.log
# Puppet Name: clearsessions
0 0 * * 0 /ezid/bin/clearsessions.sh
# Puppet Name: link_check_emailer
15 3 * * * /ezid/bin/run-shoulder-list.sh
# Puppet Name: link_check_summary_report
0 3 10 * * /ezid/bin/link_check_summary_wrapper.sh
Issue and solution:
Currently the shoulder-list.txt file is generated by a cron job every day at 3:15am. However, it is deleted when we run the EZID deployment script. Ashley and I looked into a few solutions. Since the shoulder-list.txt file is a temporary solution, we think the easiest approach is to run the /ezid/bin/run-shoulder-list.sh command manually after each EZID code deployment.
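Since the manual step is easy to forget, a small check along the following lines could flag a missing or outdated file after a deployment. This is only a sketch: the function name and the 24-hour threshold are assumptions, and the path in the usage note is the one mentioned earlier in this thread.

```python
import os
import time


def shoulder_list_is_stale(path, max_age_seconds=24 * 60 * 60, now=None):
    """Return True if the file is missing, empty, or older than max_age_seconds."""
    now = time.time() if now is None else now
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return True  # deleted, e.g. by a deployment
    if st.st_size == 0:
        return True  # the command ran but produced no output
    return (now - st.st_mtime) > max_age_seconds
```

For example, shoulder_list_is_stale("/apps/ezid/ezid/static/info/shoulder-list.txt") returning True right after a deployment would indicate that run-shoulder-list.sh still needs to be run.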
Created new ticket #549 for creating EZID APIs to list NAANs and shoulders.
The arks.org site retrieves shoulder-list.txt, a text file containing a list of all the EZID shoulders (DOI or ARK), to ensure its configuration is up to date.
The file was generated by the manage.py shoulder-list operation and was placed in ezid/static/info/shoulder-list.txt, which could be retrieved by the arks.org service from the URL https://ezid.cdlib.org/static/info/shoulder-list.txt.
Something broke with a recent update to EZID: the manage.py shoulder-list operation no longer works and the shoulder-list.txt file is no longer available.
Temporary fix: Dave placed a copy of the shoulder-list.txt file on the ezid-prd instance.
Short term solution: I will test the script and ask Ashley to set up a cron job to update shoulder-list.txt once a week.
Long term solution: