RockefellerArchiveCenter / rac-data-model

Working repo for Project Electron data model.

Determine final rules for Dates object requirements #35

Closed p-galligan closed 4 years ago

p-galligan commented 5 years ago

We want to require begin dates when possible, but ArchivesSpace doesn't require begin dates on date objects.

Should we allow "null" dates for begin dates? Or should we parse date expressions to begin dates?

helrond commented 5 years ago

I think there are three options:

It may also be worth considering where we do this parsing and manipulation. Part of me thinks we should do it at the source (i.e., ArchivesSpace), since a) some tools already do this and b) we'd only have to do it once, instead of every time the data is transformed.

I know @bonniegee has gotten into this with the digitization dates, so I'm tagging her in case she has advice or relevant code snippets.

bonniegee commented 5 years ago

I have parsed date expressions to get end years, but it is not very elegant and allows for a fair amount of error. (Which was fine in the case I was using it for, but would need to be refined for this use.)

When a structured date doesn't exist AND the date expression was not "undated," I parsed the date expression from the end until I got a string of four digits. Relevant code is here. I also ran the find_year() function over titles and certain notes (like scope and content) because that was uncovering information, but that would be more relevant to data cleanup than to this.
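A minimal sketch of the approach described above, not the linked script itself: skip "undated" expressions, then take the last standalone run of four digits in the date expression as the end year. The function name `last_four_digit_year` and the regular expression are assumptions made for illustration, and the sketch inherits the same room for error (any four-digit number in the expression is treated as a year).

```python
import re
from typing import Optional

# Exactly four digits, not embedded in a longer run of digits.
FOUR_DIGITS = re.compile(r"(?<!\d)\d{4}(?!\d)")


def last_four_digit_year(expression: Optional[str]) -> Optional[str]:
    """Return the last four-digit run in a date expression, or None.

    Hypothetical helper: expressions that are empty or "undated" yield None.
    """
    if not expression or expression.strip().lower() == "undated":
        return None
    matches = FOUR_DIGITS.findall(expression)
    # Reading "from the end" is equivalent to taking the final match.
    return matches[-1] if matches else None
```

For example, `last_four_digit_year("circa 1950-1962")` takes `1962` as the end year, while `last_four_digit_year("undated")` returns `None`.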

Also, that linked script gets the date of an ancestor (parsed as YYYY) if there isn't satisfactory date information for a component.
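The ancestor fallback could look roughly like the following. This is a sketch under assumptions: each component is modeled as a plain dict with a hypothetical `year` key (a YYYY string or `None`) and a hypothetical `parent` key pointing at the enclosing component, which is not how the linked script or ArchivesSpace necessarily represents the hierarchy.

```python
from typing import Optional


def inherited_year(component: Optional[dict]) -> Optional[str]:
    """Walk up the hierarchy and return the first year found, or None.

    Assumes hypothetical "year" and "parent" keys on each component dict.
    """
    node = component
    while node is not None:
        year = node.get("year")
        if year:
            return year
        node = node.get("parent")  # move to the ancestor, if any
    return None
```

With a series dated `"1962"` and an undated file beneath it, `inherited_year(file)` returns the series year.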

p-galligan commented 4 years ago

The data is being fixed in ArchivesSpace (AS) before transformation, so begin dates will be required.
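One way to find records that still need cleanup before the requirement is enforced is a simple check over date objects. This is only a sketch: it assumes each date is a dict with an optional `begin` key, as in ArchivesSpace date records, and the function name `dates_missing_begin` is hypothetical.

```python
from typing import Iterable, List


def dates_missing_begin(dates: Iterable[dict]) -> List[dict]:
    """Return the date objects whose "begin" value is absent or blank."""
    return [d for d in dates if not (d.get("begin") or "").strip()]
```

Running this over a list of date objects returns only the ones that would fail a required begin date.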