- for PAGE's `Metadata/Created`: a separate `Description/Processing` element with `processingCategory=contentGeneration` and the respective `processingDateTime` (independent of the `step_alto` entries for each `Metadata/MetadataItem`)
- for PAGE's `Metadata/LastChange`: a separate `Description/Processing` element with `processingCategory=contentModification` and the respective `processingDateTime` (independent of the `step_alto` entries for each `Metadata/MetadataItem`)
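
A minimal sketch of that mapping, to make the proposal concrete. It is not taken from page-to-alto: the function name and the `page_meta_N` ID scheme are hypothetical, XML namespaces are left out for brevity, and the child element names (`processingCategory`, `processingDateTime`) are assumed to follow the ALTO v4 `Processing` structure.

```python
import xml.etree.ElementTree as ET


def page_dates_to_alto_v4(description, created, last_change):
    """Append one Processing element per PAGE timestamp to an ALTO Description.

    Hypothetical helper for illustration only; namespaces are omitted.
    """
    mapping = [
        ("contentGeneration", created),        # from PAGE Metadata/Created
        ("contentModification", last_change),  # from PAGE Metadata/LastChange
    ]
    for n, (category, timestamp) in enumerate(mapping):
        if not timestamp:
            continue  # PAGE file did not carry this timestamp
        # ID scheme is an assumption; any unique ID would do
        proc = ET.SubElement(description, "Processing", ID=f"page_meta_{n}")
        ET.SubElement(proc, "processingCategory").text = category
        ET.SubElement(proc, "processingDateTime").text = timestamp
    return description


description = ET.Element("Description")
page_dates_to_alto_v4(description, "2024-01-10T09:00:00", "2024-01-12T14:30:00")
print(ET.tostring(description, encoding="unicode"))
```

These two entries would sit next to, not replace, whatever is generated for each `Metadata/MetadataItem`.
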
For ALTO v2 with its `preProcessingStep|ocrProcessingStep|postProcessingStep` distinction, one would probably have to map to:
- for PAGE's `Metadata/Created`: a separate `Description/OCRProcessing` element with `ocrProcessingType=preProcessingStep` and the respective `processingDateTime` (independent of the `step_alto` entries for each `Metadata/MetadataItem`)
- for PAGE's `Metadata/LastChange`: a separate `Description/OCRProcessing` element with `ocrProcessingType=postProcessingStep` and the respective `processingDateTime` (independent of the `step_alto` entries for each `Metadata/MetadataItem`)
Obviously, this is not ideal. However, since PAGE's `Created`/`LastChange` do not have clear semantics, I would argue this is the best pragmatic fit.
BTW, we are also still missing `Metadata/Creator`! IMO this should go into the `contentGeneration` (or `preProcessingStep`) entry.
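
To summarize the whole proposal in one place, here is a rough sketch as a lookup table keyed by PAGE field and ALTO major version, with `Metadata/Creator` attached to the entry derived from `Metadata/Created`. Everything here is illustrative: `STEP_MAP`, `plan_entries`, and the `creator` key are hypothetical names, and which ALTO sub-element would actually carry the creator is left open.

```python
# (PAGE field, ALTO major version) -> (Description child element, step label)
STEP_MAP = {
    ("Created", 4): ("Processing", "contentGeneration"),
    ("LastChange", 4): ("Processing", "contentModification"),
    ("Created", 2): ("OCRProcessing", "preProcessingStep"),
    ("LastChange", 2): ("OCRProcessing", "postProcessingStep"),
}


def plan_entries(page_meta, alto_version):
    """Return the extra ALTO entries implied by PAGE Created/LastChange/Creator.

    page_meta is a plain dict with optional keys "Created", "LastChange"
    and "Creator" (hypothetical input shape, for illustration only).
    """
    entries = []
    for field in ("Created", "LastChange"):
        timestamp = page_meta.get(field)
        if timestamp is None:
            continue
        element, step = STEP_MAP[(field, alto_version)]
        entry = {"element": element, "step": step, "processingDateTime": timestamp}
        # Creator goes with the contentGeneration / preProcessingStep entry
        if field == "Created" and page_meta.get("Creator"):
            entry["creator"] = page_meta["Creator"]
        entries.append(entry)
    return entries


meta = {
    "Creator": "some-tool",
    "Created": "2024-01-10T09:00:00",
    "LastChange": "2024-01-12T14:30:00",
}
v4_entries = plan_entries(meta, 4)
v2_entries = plan_entries(meta, 2)
```

Keeping the version distinction in a table like this would confine the admittedly imperfect v2 mapping to a single place.
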
Originally posted by @bertsky in https://github.com/kba/page-to-alto/issues/37#issuecomment-1888867562