NHMDenmark / DaSSCo-Tranche-1-work


Asset Registry System (ARS): Development Phase 3, Work Package 5A #35

Closed PipBrewer closed 2 months ago

PipBrewer commented 2 months ago

From the finalised SKI agreement (Customer Mission Statement) for WP5 as a whole:

Work Package 5: Extension of UI

Work Package Overview: This work package extends the existing functionality of the ARS UI. It outlines the interface to track progress, view summary statistics, and obtain reports on the data assimilation process of DaSSCo, for example the number of specimens digitised by a given digitiser or the overall number of specimens digitised. It should also provide an interface for performing data quality checks and improve the user experience of managing access to assets. Finally, it should be able to visualise errors or failures occurring in automation pipelines, synchronisation issues, or a list of specimens that need auditing.

Description: It is our current understanding that the communication between the UI and the ARS is through an API, and we would like access to that API, including its documentation.

Requirements for the UI:
• Provide role-based access to view and edit assets and the metadata status of individual assets.
• Provide an interface for performing quality assurance checks on a subset of the images, persisting the results, and marking the images as checked. This will provide project assurance.
• Show failures that occurred while syncing an asset with storage, Specify, or other similar events, e.g. the number of assets that were sent to long-term storage and have been removed from the ARS.
• Present an overview of the current capacity of the file proxy and the state of assets in use, including the assets that are receiving active updates or are in progress (uploading/downloading). For example, it could show all the assets in the file proxy along with system health, to evaluate the need for adjustments.
• Enable adjustments of assets in the file proxy, such as removing a lingering asset from the file proxy, or assigning, increasing, or decreasing the allotted file shares.
• Enable search of assets based on metadata such as updated_by, audit_date, etc., selection of the results, and creation of links to them for an external person who has signed up. This requirement overlaps with WP4.
• Enable an easy querying interface for filtering assets on metadata fields and retrieving a list of results that can be exported to CSV. Alternatively, it should be possible to click on a result to view more details or go to the asset and view it. It should also be possible to save the search and/or save groups of assets to come back to at a later date. This will be helpful when working through problem issues, or assets selected for quality checking, which may take place over several days (or longer).
• Enable the search for records without an image, or images without a record, and then investigate and resolve the issue.
• Obtain a graphical overview of statistics, e.g. how rates have changed over time, per digitiser, per pipeline, etc. It should be possible to select various options from the graph menu, such as time frame, workstation, etc. It should also be possible to download graphs as images and their data in CSV format. This would help in monitoring progress, managing and optimising work, and reporting on it. For example, it should provide an interface to find out how many images, and of which type (e.g. tiff, jpeg), have been uploaded in the last year. It should also provide an approximate number of images per specimen, in a graphical representation, for a particular collection or for a particular asset type such as a CT scan. This will help with reporting, monitoring that pipelines are working, evaluating the number and impact of multi-image specimens (e.g. on digitisation rates) and their effect on total storage capacity, and using the data to make predictions (e.g. requirements for future storage). This data should be retrievable as a bespoke search in the UI.
• The graphical display of rates should be easy to zoom and focus, but should not be too sensitive and should act within limits so as not to distort the view. It should be easy to reset the view to the default zoom.
• The graphical display of rates should not show negative values.
• It should be possible to easily refresh the graphical display of rates, health, and stats, and to ensure that they are refreshed regularly and automatically so that they show the most up-to-date information.
• It should be possible to view images and, if the quality of these images is not satisfactory, examine the metadata to compare and look for commonality in order to identify the issue (e.g. is it the equipment, the digitiser, a wrong derivative attached, etc.).
• It should be possible to download a large number of images onto an external hard drive, to be given to a researcher in response to a request.
• It should be possible to give an external user access to a large number of special assets (high-resolution tiff images, CT scans, etc.) at the same time.
• It should be possible to update the metadata associated with a large number of images (e.g. when the name of the collection has changed).
• It should be possible to assign roles to the functionality and provide time-based access for certain functionality, for example renaming metadata for several assets.
• It should also be possible to do user management in the UI. This could be done via the IdP Keycloak, and the user should also be able to sign up for the system with basic access and request further access via the system. This might be handled by receiving notifications on assets, users, tagged records, etc.
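To make the querying requirement above more concrete, here is a minimal sketch of filtering assets on metadata fields and exporting the results to CSV. It is an illustration only, not the ARS API: the in-memory `ASSETS` list, the `asset_guid` and `file_format` fields, and the `filter_assets` and `to_csv` helpers are all hypothetical. Only `updated_by` and `audit_date` are taken from the requirements.

```python
import csv
import io
from datetime import date

# Hypothetical asset metadata records. Only updated_by and audit_date are
# named in the requirements; the other fields are assumptions.
ASSETS = [
    {"asset_guid": "a-001", "updated_by": "digitiser_1",
     "audit_date": date(2025, 4, 2), "file_format": "tiff"},
    {"asset_guid": "a-002", "updated_by": "digitiser_2",
     "audit_date": date(2025, 4, 3), "file_format": "jpeg"},
    {"asset_guid": "a-003", "updated_by": "digitiser_1",
     "audit_date": date(2025, 4, 9), "file_format": "tiff"},
]


def filter_assets(assets, **criteria):
    """Return the assets whose metadata equals every given field value."""
    return [
        asset for asset in assets
        if all(asset.get(field) == value for field, value in criteria.items())
    ]


def to_csv(assets):
    """Serialise a list of asset dicts to CSV text (header row plus one row per asset)."""
    if not assets:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(assets[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(assets)
    return buffer.getvalue()


hits = filter_assets(ASSETS, updated_by="digitiser_1")
csv_text = to_csv(hits)
```

In a real implementation the filtering would presumably happen server-side through the ARS API rather than in memory; the sketch only shows the shape of the "filter, then export" flow that a saved search would replay.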

At the end of this package, the UI would enable management of users, provide search capabilities, and provide an interface to update assets and their metadata, as well as to download and delete assets. Some of the functionality described here overlaps with previous work packages; the specifics will be discussed iteratively, and the order of iterations can be decided during the consultation period.
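One of the search capabilities listed in the requirements, finding records without an image or images without a record, amounts to a set difference in both directions. The sketch below assumes that records and images can each be reduced to a list of specimen identifiers. The `find_mismatches` function and the identifiers are hypothetical and do not reflect how the ARS stores this information.

```python
def find_mismatches(record_ids, image_record_ids):
    """Compare specimen record IDs against the record IDs referenced by images.

    Returns a dict with two sorted lists: records that have no image, and
    image references that point at no known record.
    """
    records = set(record_ids)
    imaged = set(image_record_ids)
    return {
        "records_without_image": sorted(records - imaged),
        "images_without_record": sorted(imaged - records),
    }


# Hypothetical identifiers for illustration.
report = find_mismatches(["s1", "s2", "s3"], ["s2", "s3", "s4"])
```

Either list could then feed the "investigate and resolve" step, for example as a saved group of assets to work through.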

Estimated consultancy hours: 530 hours
Estimated internal hours (primarily BS and PB): 110 hours
Estimated start date: 01/04/2025
Estimated end date: 30/06/2025

Extended requirements: GitHub board: https://github.com/orgs/NHMDenmark/projects/32