Add "Date Accessioned" field to Admin facets under Publication State

geobtaa / geoblacklight_admin

Add "Date Accessioned" field to Admin facets under Publication State #32

Closed: karenmajewicz closed this issue 7 months ago

karenmajewicz commented 7 months ago

When we reharvest from data portals, we create a fresh list of current datasets. We then upload them to GBL Admin and publish them.

Then we need to isolate the outdated records and unpublish them. To do so, we check the "Date Accessioned" field. It would be easier to find and unpublish them if the Date Accessioned field were a facet right under Publication State.

ewlarson commented 7 months ago

We want to make faceting configurable.

ewlarson commented 7 months ago

Okay... this is a lot harder than originally imagined. The values in that field are not normalized.

Examples:

["20230929"]
["2018-12-07"]
["8/31/20"]
[""]
[]
["2019", "2021-04-27", "2022-10-27"]

[Screenshot: 2023-12-05 at 1:20:36 PM]

Unfortunately, the field is just a multi-valued string, with no sanity checking/control.
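
To make the cleanup concrete, here is a rough sketch of how the example values above could be coerced to YYYY-MM-DD. This is plain Ruby, not code from GBL Admin: the helper names `normalize_accession_value` and `normalize_accession_values` are hypothetical, only the three formats seen in the examples are handled, and two-digit years are assumed to fall in the 2000s. Year-only values such as "2019" are not guessed at and come back as nil so they can be reviewed by hand.

```ruby
require "date"

# Hypothetical helper: returns "YYYY-MM-DD" or nil if the value cannot be
# read as a full calendar date in one of the formats seen in the examples.
def normalize_accession_value(raw)
  text = raw.to_s.strip
  parts =
    case text
    when /\A(\d{4})-(\d{2})-(\d{2})\z/ then [$1, $2, $3]             # 2018-12-07
    when /\A(\d{4})(\d{2})(\d{2})\z/ then [$1, $2, $3]               # 20230929
    when %r{\A(\d{1,2})/(\d{1,2})/(\d{2})\z} then ["20#{$3}", $1, $2] # 8/31/20 (assumes 20xx)
    end
  return nil if parts.nil?

  # Date.new raises for impossible dates such as February 30.
  Date.new(*parts.map(&:to_i)).iso8601
rescue ArgumentError
  nil
end

# Hypothetical helper for the multi-valued field: keeps only the values that
# normalize cleanly, de-duplicated and sorted.
def normalize_accession_values(values)
  Array(values).map { |value| normalize_accession_value(value) }.compact.uniq.sort
end

normalize_accession_values(["2019", "2021-04-27", "2022-10-27"])
# keeps the two full dates and drops the year-only value
```

Dropping unparseable values silently would lose information in a real cleanup, so a migration built on this idea would probably want to log or export them first.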

ewlarson commented 7 months ago

Some steps to move ahead:

Write a database migration to cleanse the data already captured
Change this element's type to a date
Use a placeholder / inputmask solution to guide the right data on entry (YYYY-MM-DD)
Add a validation step for the data (ensure it's a proper date)
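
For the validation step, something along these lines might work. It is a minimal sketch in plain Ruby rather than the actual GBL Admin validator; the method name `valid_accession_date?` is made up for illustration, and whether blank values should be accepted is left to the caller.

```ruby
require "date"

# Strict shape check: four-digit year, two-digit month, two-digit day.
ISO_DATE_SHAPE = /\A\d{4}-\d{2}-\d{2}\z/

# Hypothetical validator: true only for a real calendar date in YYYY-MM-DD.
def valid_accession_date?(value)
  return false unless value.is_a?(String) && value.match?(ISO_DATE_SHAPE)

  # Date.iso8601 raises for dates that have the right shape but do not exist.
  Date.iso8601(value)
  true
rescue ArgumentError
  false
end

valid_accession_date?("2018-12-07") # a well-formed, real date
valid_accession_date?("8/31/20")    # wrong shape, so rejected
```

The shape check comes first because Ruby's date parsers accept several ISO 8601 variants, and the goal here is one canonical format.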

karenmajewicz commented 7 months ago

I normalized the records so that they are all single values of yyyy-mm-dd or blank. Our newer harvesting scripts automatically insert these values, so most of the records were already in that format. https://geo.btaa.org/admin/blazer/queries/43-accessioned-dates

Another thing to consider here is the idea we had to automatically tag imports with some kind of generated accession code. That is the main purpose of these dates: to be able to differentiate between successive uploads.

karenmajewicz commented 7 months ago

The various date/tracking fields we use:

Internal Metadata:

created_at: Generated by GBL Admin; records the first time an ID was created or uploaded.
updated_at: Generated by GBL Admin; records each time a record was changed, either through CSV upload or manual edit.

Kithe Model:

b1g_dateAccessioned_sm: Part of the B1G profile of Aardvark for keeping track of harvest cycles. This reflects batches/uploads of a harvest/reharvest. It may or may not match created_at or updated_at.

karenmajewicz commented 7 months ago

https://github.com/geobtaa/geoportal/issues/507

karenmajewicz commented 7 months ago

Adding to Advanced Search: https://github.com/geobtaa/geoportal/issues/552