Closed by kba 8 months ago
IMHO the correct representation would have been:

- `Metadata/Created`: a separate `Description/Processing` element with `processingCategory=contentGeneration` and the respective `processingDateTime` (independent of the `step_alto` entries for each `Metadata/MetadataItem`)
- `Metadata/LastChange`: a separate `Description/Processing` element with `processingCategory=contentModification` and the respective `processingDateTime` (independent of the `step_alto` entries for each `Metadata/MetadataItem`)

For ALTO v2 with its `preProcessingStep|ocrProcessingStep|postProcessingStep` distinction, one would probably have to map to:

- `Metadata/Created`: a separate `Description/OCRProcessing` element with `ocrProcessingType=preProcessingStep` and the respective `processingDateTime` (independent of the `step_alto` entries for each `Metadata/MetadataItem`)
- `Metadata/LastChange`: a separate `Description/OCRProcessing` element with `ocrProcessingType=postProcessingStep` and the respective `processingDateTime` (independent of the `step_alto` entries for each `Metadata/MetadataItem`)

Obviously, this is not ideal. However, since PAGE's `Created`/`LastChange` do not have clear semantics, I would argue this is the best pragmatic fit.
BTW, we are also still missing `Metadata/Creator`! IMO this should go into the `contentGeneration` (or `preProcessingStep`) entry.
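
To make the proposed ALTO v4 mapping concrete, here is a minimal sketch using only Python's standard library. It is not the converter's actual code: the function names `add_processing` and `map_page_dates`, the generated `ID` values, and the specific PAGE namespace version are assumptions for illustration. `Metadata/Creator` is left out because it is not settled above which child element of the entry it should populate.

```python
import xml.etree.ElementTree as ET

# Assumed namespaces: one PAGE schema version and ALTO v4.
PAGE_NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"
ALTO_NS = "http://www.loc.gov/standards/alto/ns-v4#"


def add_processing(description, category, timestamp, step_id):
    """Append one alto:Processing element carrying a category and a date."""
    proc = ET.SubElement(description, f"{{{ALTO_NS}}}Processing", {"ID": step_id})
    ET.SubElement(proc, f"{{{ALTO_NS}}}processingCategory").text = category
    ET.SubElement(proc, f"{{{ALTO_NS}}}processingDateTime").text = timestamp
    return proc


def map_page_dates(page_metadata, description):
    """Map pc:Created / pc:LastChange to separate alto:Processing elements.

    These entries are independent of whatever is generated per MetadataItem.
    """
    mapping = [
        ("Created", "contentGeneration"),
        ("LastChange", "contentModification"),
    ]
    for index, (page_name, category) in enumerate(mapping):
        value = page_metadata.findtext(f"{{{PAGE_NS}}}{page_name}")
        if value:  # skip elements that are absent or empty
            add_processing(description, category, value.strip(), f"page_date_{index}")
    return description
```

For ALTO v2, the same loop would presumably target `OCRProcessing` with `preProcessingStep` and `postProcessingStep` instead, as described above.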
With this PR, the `alto:processingDateTime` element of an `alto:processingStep` will be set to either the `pc:Created` timestamp (`--timestamp-src Created`), the `pc:LastChange` timestamp (`--timestamp-src LastChange`) or not at all, like before (`--timestamp-src none`).

This is not 100% correct since `Created` and `LastChange` are document-wide and not step-specific, but we have no other source for them AFAICS, and it is important for our (@StaatsbibliothekBerlin) workflows to have at least an approximate date for versioning purposes in the `alto:processingStep`s.
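
The selection behaviour described for `--timestamp-src` can be sketched roughly as follows. This is an illustration only, not the code in this PR: the function name `pick_timestamp` and its signature are hypothetical, and only the three option values come from the description above.

```python
def pick_timestamp(created, last_change, timestamp_src="none"):
    """Return the document-wide timestamp to copy into each processing step.

    created / last_change: the pc:Created and pc:LastChange strings (or None).
    timestamp_src: one of "Created", "LastChange" or "none".
    """
    if timestamp_src == "Created":
        return created
    if timestamp_src == "LastChange":
        return last_change
    if timestamp_src == "none":
        return None  # previous behaviour: no processingDateTime is written
    raise ValueError(f"unknown timestamp source: {timestamp_src!r}")
```

Because the returned value is the same for every step of a document, it is only an approximate date for each individual step, which matches the caveat above.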