vu-rdm-tech / yoda-pilot

A place to track issues we run into in the Yoda pilots

Shadow database for administration #43

Closed peer35 closed 2 years ago

peer35 commented 3 years ago

We will need a database to store administrative data about Yoda projects. Let's see if we can think of some requirements.

Data sources:

Functionalities:

Data fields, per project:

peer35 commented 3 years ago

preliminary Yoda administration db

Every project has:

To be updated monthly(?):

Tables:

person: id (PK; in 10 years a vunetid could be reused?), vunetid, firstname, lastname, orcid, created, updated

department (these things change, so do not try to follow Pure): id (PK), name, abbreviation, faculty, institute

budget: id (PK; or use the budget code?), code, type, vunetid, created, updated

project: id (PK), title, description, research_id, owner_id, manager_id (interesting to have another name if the owner is unavailable), department_id, budget_code, request_date, delete_date

vault: id, yoda_name, retention (? might be interesting for reporting), created

publish: yoda_id, vault_id, doi, created, tombstoned

research: id, category (get from Yoda), yoda_name, created, deleted

category_datamanager (generated from Yoda): category, vunetid

research_vaults: research_id, vault_id

vault_stats: id, vault_id, date, size

research_stats: id, research_id, date, size
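To make the relations concrete, here is a minimal sketch of a few of these tables in SQLite, using only Python's standard library. It is not the final model: the column types, the `UNIQUE` constraint on `yoda_name`, and the helper names `create_db` and `latest_size` are assumptions for illustration, and `department_id` and `budget_code` are left out of `project` to keep it short.

```python
import sqlite3

# Subset of the preliminary tables; types and constraints are assumptions.
SCHEMA = """
CREATE TABLE person (
    id        INTEGER PRIMARY KEY,  -- surrogate key, since a vunetid might be reused
    vunetid   TEXT NOT NULL,
    firstname TEXT,
    lastname  TEXT,
    orcid     TEXT,
    created   TEXT,
    updated   TEXT
);
CREATE TABLE research (
    id        INTEGER PRIMARY KEY,
    category  TEXT,                 -- taken from Yoda
    yoda_name TEXT NOT NULL UNIQUE, -- uniqueness is an assumption
    created   TEXT,
    deleted   TEXT
);
CREATE TABLE project (
    id           INTEGER PRIMARY KEY,
    title        TEXT NOT NULL,
    description  TEXT,
    research_id  INTEGER REFERENCES research(id),
    owner_id     INTEGER REFERENCES person(id),
    manager_id   INTEGER REFERENCES person(id),
    request_date TEXT,
    delete_date  TEXT
);
CREATE TABLE research_stats (
    id          INTEGER PRIMARY KEY,
    research_id INTEGER NOT NULL REFERENCES research(id),
    date        TEXT NOT NULL,      -- ISO date, so text ordering is chronological
    size        INTEGER NOT NULL    -- bytes
);
"""


def create_db(path=":memory:"):
    """Create the sketch schema and return an open connection."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def latest_size(conn, yoda_name):
    """Return the most recent recorded size for a research group, or None."""
    row = conn.execute(
        "SELECT s.size FROM research_stats s "
        "JOIN research r ON r.id = s.research_id "
        "WHERE r.yoda_name = ? ORDER BY s.date DESC LIMIT 1",
        (yoda_name,),
    ).fetchone()
    return row[0] if row else None
```

Keeping the stats in their own table with one row per date means the monthly update only appends rows, and reporting can pick the latest row or plot growth over time.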

peer35 commented 3 years ago

Minimal functionality:

Admin:

It would be nice to have this in a simple web interface for the admin. Build it in Django?

Research/management/public:

peer35 commented 3 years ago

I started a new repo: https://github.com/vu-rdm-tech/adminyoda

The forms in the Django admin interface are actually a pretty nice way to add and edit records. Main things to do:

peer35 commented 3 years ago

Current database model: https://github.com/vu-rdm-tech/adminyoda/blob/master/projects/models.py

peer35 commented 3 years ago

Since it looks like you need a rodsadmin account to run the queries on all groups, it might be a good idea to investigate how to implement gathering the stats in an iRODS rule. The question remains, though, whether I will be able to schedule a job to run it.

peer35 commented 3 years ago

The existing rule in Yoda is started via cron, see https://github.com/UtrechtUniversity/yoda/blob/development/roles/yoda_rulesets/tasks/irods-ruleset-uu.yml:

- name: Enable storage statistics gathering cronjob
  become_user: "{{ irods_service_account }}"
  become: yes
  cron:
    name: 'monthly-storage-statistics'
    minute: '0'
    hour: '5'
    day: '1'
    job: '/bin/irule -F /etc/irods/irods-ruleset-uu/tools/monthly-storage-statistics.r >> /var/lib/irods/log/job_monthly-storage-statistics.log 2>&1'
  when: monthly_storage_statistics.stat.exists

peer35 commented 3 years ago

Delay should work: https://docs.irods.org/4.1.7/manual/rule_engine/

Theoretically you could have it repeat every day.

bgoli commented 3 years ago

A cron job (or delay) would be best, but as an interim solution, could irule be run via a remote login (e.g. from a monitor machine)?

peer35 commented 3 years ago

The thing is that it needs my personal rodsadmin account. Creating a new rodsadmin "service account" is possible but also not great. I think there are 2 safe options:

  1. Create a cronjob on the iRODS server that runs the rule as the rods system user. We will need Surf for that, but it could be arranged via Ansible by adding our own "vu-yoda-ruleset".
  2. Use the Yoda monitoring Surf is building. If we could get a daily data dump from elastic that would work as well.

Option 2 is the best. Until then, yes, I can run the iRODS script on a separate machine.
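Whichever route supplies the data, the numbers have to be rolled up per group before they can be stored as stats rows. A minimal sketch of that step, assuming the query returns (collection path, size in bytes) pairs and that group collections sit at `/<zone>/home/<group>` with `research-` and `vault-` prefixes. The function name `sizes_per_group` is hypothetical, and the actual query against iRODS is deliberately left out.

```python
from collections import defaultdict


def sizes_per_group(rows):
    """Sum sizes per research/vault group from (collection_path, size) pairs.

    Assumes paths look like /<zone>/home/<group>/...; anything else is skipped.
    """
    totals = defaultdict(int)
    for path, size in rows:
        parts = path.strip("/").split("/")
        # Expect at least: zone, "home", group
        if len(parts) < 3 or parts[1] != "home":
            continue
        group = parts[2]
        if group.startswith(("research-", "vault-")):
            totals[group] += size
    return dict(totals)
```

The same function works whether the pairs come from a script run with a rodsadmin account or from a daily dump, as long as the input can be reduced to paths and sizes.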

peer35 commented 3 years ago

Database live at: https://adminyoda.labs.vu.nl

I'll leave this issue open until I have finalized the 1.0 version. After that I'll create a new issue for the rodsadmin issue to be picked up once we know more about the Surf monitoring application.

peer35 commented 2 years ago

Moved to Jira: https://jira.vu.nl/browse/RDA-134

Specific technical issues can be logged here: https://github.com/vu-rdm-tech/adminyoda/issues