tboenig / page2page

This repository holds the stylesheet and workaround for transforming the proprietary PAGE XML files from Transkribus (https://transkribus.eu/Transkribus) into valid PAGE XML (https://www.primaresearch.org/schema/PAGE/gts/pagecontent/, newest version from 2019-07-16).

remove Tag, Property and Link, or transform adequately #2

Open bertsky opened 2 years ago

bertsky commented 2 years ago

In Transkribus, all elements of the structural hierarchy contain an (arbitrarily long) sequence of Tag, Property and Link elements. These are invalid under the PAGE namespace and the original schema.
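
For illustration, the plain removal option from the title could be sketched as follows. This is only a rough Python sketch, not the repository's stylesheet: the sample document, its namespace URI and the helper names `local_name` and `strip_extras` are made up for the example, and elements are matched by local name so that no particular namespace is assumed.

```python
import xml.etree.ElementTree as ET

# Local names of the Transkribus-specific elements named in this issue.
EXTRA = {"Tag", "Property", "Link"}


def local_name(tag):
    """Return the local part of an ElementTree tag of the form '{namespace}local'."""
    return tag.rsplit("}", 1)[-1]


def strip_extras(root):
    """Remove every Tag, Property and Link element in place; return how many were removed."""
    removed = 0
    # Snapshot the tree first so that removing children does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if local_name(child.tag) in EXTRA:
                parent.remove(child)
                removed += 1
    return removed


# Hypothetical sample, NOT taken from a real Transkribus export.
SAMPLE = """<PcGts xmlns="http://example.org/page">
  <Page>
    <TextRegion id="r1">
      <Property key="k" value="v"/>
      <Tag/>
      <TextLine id="l1"><Link/><TextEquiv><Unicode>abc</Unicode></TextEquiv></TextLine>
    </TextRegion>
  </Page>
</PcGts>"""

root = ET.fromstring(SAMPLE)
count = strip_extras(root)
print(count, "elements removed")
```

Dropping the elements this way loses whatever information they carried, which is why transforming them instead may be preferable.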

At first glance, I believe we could transform these into: