introduce metadata to help future parsing

In our current strategy we focus on plain htmlToDita conversion, largely ignoring the structure of the content.

This gives a usable document, but it may defer some technical debt to whoever later needs to parse the document set.

We can reduce this by adding some metadata.

I can think of these tags:

[x] regions file (world map) - document level id
[x] regions file (world map) - imagemap
[x] region page - document level tag
[x] category page - document level tag
[x] category page - flag element (this is in place already)
[ ] unit page - document level tag
[ ] unit page - images block
[ ] unit page - signatures table
[ ] unit page - propulsion block
[ ] unit page - remarks block
[ ] images pages - document level tag
[ ] transducer page - document level tag
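
As a rough illustration of what a document level tag and a block tag could look like, here is a minimal sketch. It assumes the tags are carried in the standard DITA `outputclass` attribute; the function names and the tag values (`unit`, `remarks`) are hypothetical and not taken from the current converter.

```python
import xml.etree.ElementTree as ET


def tag_document(topic_xml: str, page_type: str) -> str:
    """Return the topic with a document level tag on its root element.

    Assumption: the tag lives in `outputclass`; the converter may choose
    a different carrier (for example prolog metadata).
    """
    root = ET.fromstring(topic_xml)
    root.set("outputclass", page_type)
    return ET.tostring(root, encoding="unicode")


def tag_block(topic_xml: str, block_path: str, block_type: str) -> str:
    """Tag the first element matching block_path (hypothetical helper)."""
    root = ET.fromstring(topic_xml)
    block = root.find(block_path)
    if block is not None:
        block.set("outputclass", block_type)
    return ET.tostring(root, encoding="unicode")


def page_type_of(topic_xml: str) -> str:
    """Read the document level tag back, as a future parser might."""
    return ET.fromstring(topic_xml).get("outputclass", "")


sample = '<topic id="u1"><title>Unit</title><body><section/></body></topic>'
tagged = tag_document(sample, "unit")
tagged = tag_block(tagged, "./body/section", "remarks")
```

A future parser could then select pages and blocks by tag rather than re-deriving them from the layout.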

Ian to investigate a pattern to determine images/transducer pages on-site. I suspect it depends on whether the filename contains _grams or _pics, but that needs double-checking. Failing that, we may need to look at the page content itself to determine the page type.
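
If the filename pattern holds up, the check could be as small as the sketch below. This is only a guess pending the on-site check: the mapping of `_pics` to images pages and `_grams` to transducer pages is an assumption, and `classify_page` is a hypothetical name.

```python
from pathlib import PurePosixPath


def classify_page(filename: str) -> str:
    """Guess the page type from the filename alone.

    Assumption (unverified): `_pics` marks an images page and `_grams`
    marks a transducer page. Anything else returns "unknown", so the
    caller can fall back to inspecting the page content.
    """
    stem = PurePosixPath(filename).stem.lower()
    if "_pics" in stem:
        return "images"
    if "_grams" in stem:
        return "transducer"
    return "unknown"
```

Returning "unknown" rather than a default type keeps the fallback to a content-based check explicit.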

@IanMayo to make sure that anchors and banjo have the correct headers/content to trigger the unit page identifiers.
