ucsdlib / damsmanager

DAMS Manager

Problems with <dams:beginDate> and <dams:endDate> #394

Closed arwenhutt closed 4 years ago

arwenhutt commented 4 years ago

Descriptive summary

When testing release https://github.com/ucsdlib/damsmanager/issues/385 , specifically https://github.com/ucsdlib/damspas/issues/732 and batch editing of a subset of records affected by it, we found that a batch export of the metadata did not include <dams:beginDate> and <dams:endDate>. These records were then batch edited, adding the missing field along with begin and end dates. The missing field was added to the records, but begin and end dates were not recorded.

When originally ingested, the records did include begin and end dates, so that data was lost at some point.

This has been replicated but not consistently.

lsitu commented 4 years ago

@arwenhutt Could you attach the edited spreadsheet from the batch export (the one used for the metadata overlay) so that I can take a look? Thank you.

arwenhutt commented 4 years ago

@lsitu I believe this is the file @abbypenn93 used for the overlay: DataMares_unit.xlsx

arwenhutt commented 4 years ago

@lsitu here's some other tests with object and overlay spreadsheets:

Test A2: Editing begin and end date values (failed)

https://library.ucsd.edu/dc/object/bb9459046m spreadsheet used for overlay: OLR_date_overlay_test_A.xlsx

Test A3: Editing begin and end date values (successful)

https://library.ucsd.edu/dc/object/bb74795062 spreadsheet used for overlay: OLR_date_overlay_test_A_again.xlsx

One difference between these two tests is that A2 used Date:collected and A3 used Date:creation

lsitu commented 4 years ago

@arwenhutt I think the problem is that Begin date and End date will only bind to dams:dateCreation, not to other date types such as dams:dateCollected. Maybe we should add this rule to validation? Or would you like to change the model to allow other date types to have Begin date and End date as well? In that case, we may need to define a rule for the order of Begin date and End date bindings when more than one date type exists.

arwenhutt commented 4 years ago

@lsitu yes, begin and end date should 1) be allowed to bind to any date type or 2) be encoded separately when there is more than one date.

See https://ucsdlibrary.atlassian.net/browse/DM-28?focusedCommentId=31572&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-31572 for historical discussion

lsitu commented 4 years ago

@arwenhutt Option 2), "be encoded separately when there is more than one date", seems odd, since every dams:Date should have the rdf:value field. Is there any preference for binding Begin date and End date to a specific dams:Date type when there is more than one date?

lsitu commented 4 years ago

@arwenhutt Also, if begin and end dates are going to be encoded separately, that violates the dams:Date model, which needs to have the dams:type attribute drawn from the date type CV list.

lsitu commented 4 years ago

@arwenhutt Do you have any suggestions for binding Begin date and End date to one of the dates when there is more than one date? It seems that separating Begin date and End date doesn't comply with the dams:Date model, which requires dams:type. In the Excel InputStream tool, we will allow the following date types: Date:collected, Date:copyright, Date:creation, Date:issued, Date:event.
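As a rough illustration of the date-type list above, a header check could look like the following Python sketch. The function name `unknown_date_headers` and the idea of recognising date columns by their `Date:` prefix are assumptions made for illustration. This is not the Excel InputStream tool's actual code.

```python
# Date types named in the comment above.
ALLOWED_DATE_HEADERS = {
    "Date:collected",
    "Date:copyright",
    "Date:creation",
    "Date:issued",
    "Date:event",
}


def unknown_date_headers(headers):
    """Return the 'Date:' column headers that are not in the allowed list.

    Hypothetical helper: assumes date columns are identified by a
    'Date:' prefix in the spreadsheet header row.
    """
    return [
        h for h in headers
        if h.startswith("Date:") and h not in ALLOWED_DATE_HEADERS
    ]
```

Calling it with a spreadsheet's header row would return any date columns that fall outside the list, which could then be reported before ingest.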

arwenhutt commented 4 years ago

@lsitu sorry for the delay in responding, I've been out of the office with a sick kiddo.

It's not possible to make a rule for binding begin and end date to a specific date type, since the begin and end dates can represent a date of any type.

The approach I described is not a change in practice; it's what was agreed on when the Excel ingest tool was developed, and the way it has worked since then. As discussed in the Jira ticket/comment I linked above, it's not an ideal approach but it's good enough. All of the objects ingested using the Excel ingest tool follow this approach: in some cases the begin and end date will be bound to the creation date, but in many cases they will be bound to another date type or encoded separately. Just a few examples where the begin and end date are not bound to date:creation :

This ticket is just to address the bug identified with existing dates being lost, not to change date encoding practice, which would require a lot of work and cleanup of existing records.

lsitu commented 4 years ago

@arwenhutt Okay. Then I will just move forward with a separate dams:Date for Begin date and End date when more than one date exists.
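The behaviour agreed on in this thread can be sketched in Python roughly as follows. The dict layout, the function name `bind_begin_end`, the `None` type on the separate entry, and the handling of records with no date at all are assumptions for illustration. The thread does not say how the separate dams:Date is typed, and the actual change is in the damsmanager code, not shown here.

```python
def bind_begin_end(dates, begin=None, end=None):
    """Attach begin/end to the only date, or add a separate date entry.

    `dates` is a list of dicts such as {"type": "collected", "value": "1998"}
    (a hypothetical layout). Returns a new list; the input is not modified.
    """
    result = [dict(d) for d in dates]
    if begin is None and end is None:
        return result
    if len(result) == 1:
        # Exactly one date: bind begin/end to it, whatever its type.
        result[0]["begin"] = begin
        result[0]["end"] = end
    else:
        # More than one date (or, as an assumption, none): keep begin/end
        # in a separate entry instead of picking one date type.
        result.append(
            {"type": None, "value": None, "begin": begin, "end": end}
        )
    return result
```

The point of the branch is that begin and end dates are never dropped just because the record lacks a creation date.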

arwenhutt commented 4 years ago

@lsitu Great, thanks!

lsitu commented 4 years ago

@mcritchlow @arwenhutt I've created PR https://github.com/ucsdlib/damsmanager/pull/400 for it, which is ready for review now. Thanks.

arwenhutt commented 4 years ago

@lsitu Can you give us some information about what conditions cause the problem with dates? Also, when was the relevant code deployed?

This will help us with figuring out how to test it and give us an idea for how many records may have been affected and may need to be fixed.

lsitu commented 4 years ago

@arwenhutt I think it is caused by a bug that selectively binds begin/end date to date:creation. Records recently ingested or overlaid with begin/end date but no date:creation should be affected.
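The condition described above could be turned into a simple filter over exported rows, sketched here in Python. Treating each record as a dict keyed by the spreadsheet column headers `Begin date`, `End date` and `Date:creation`, and the function name `possibly_affected`, are assumptions for illustration. The deployment date is not given in this thread, so the sketch does not filter by when a record was ingested.

```python
def possibly_affected(record):
    """True if a row has a begin or end date but no creation date.

    `record` is a hypothetical dict of column header -> cell value.
    Empty strings and missing keys both count as "no value".
    """
    has_range = bool(record.get("Begin date") or record.get("End date"))
    has_creation = bool(record.get("Date:creation"))
    return has_range and not has_creation
```

Rows for which this returns true are candidates for review, not confirmed losses, since the original spreadsheet values would still need to be compared with what is stored.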

arwenhutt commented 4 years ago

@lsitu verified on production ✔️