kba / transkribus-to-prima

Convert Transkribus PAGE-XML to standard PAGE-XML

Fixes #1, #2, #3, #4, #6 and #8 (#12)

Closed by kba 2 years ago

kba commented 2 years ago

This replaces #10, which was closed because the Git history was rewritten:

- CLI: delegate to click.File
- CLI: default to all fixers
- TableCell: convert @Label, too
- TableCell/Region: preserve all std attributes and elements
- add option for validation of output
- fix TextEquiv/UnicodeAlternative
- delegate to TranskribusFixer class for choices available, add docstrings
- add fixer for Page/@image* transformation attributes
- add fixer removing Tag, Property and Link elements
- add option to promote TranskribusMetadata/@imgurl to @imageFilename
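
For readers unfamiliar with the last item, here is a rough sketch of what promoting `TranskribusMetadata/@imgurl` to `Page/@imageFilename` could look like. It is not the code from this PR: the function name `promote_imgurl`, the sample document, and the namespace version are assumptions made for illustration, and the attribute name is spelled as in the list above (real Transkribus exports may use different casing).

```python
import xml.etree.ElementTree as ET

# Assumed PAGE namespace for this sample; real files may use another version.
PAGE_NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15"
NS = {"pc": PAGE_NS}


def promote_imgurl(root, attr="imgurl"):
    """Copy TranskribusMetadata/@<attr> to Page/@imageFilename.

    Hypothetical helper, not the project's API. Returns True if the
    attribute was promoted, False if nothing was changed.
    """
    meta = root.find("pc:Metadata/pc:TranskribusMetadata", NS)
    page = root.find("pc:Page", NS)
    if meta is None or page is None:
        return False
    url = meta.get(attr)
    if not url:
        return False
    page.set("imageFilename", url)
    return True


# Made-up minimal document, just enough structure to exercise the helper.
SAMPLE = f"""<PcGts xmlns="{PAGE_NS}">
  <Metadata>
    <TranskribusMetadata imgurl="https://example.org/page-0001.jpg"/>
  </Metadata>
  <Page imageFilename="0001.jpg" imageWidth="100" imageHeight="200"/>
</PcGts>"""

root = ET.fromstring(SAMPLE)
changed = promote_imgurl(root)
page = root.find("pc:Page", NS)
```

The helper leaves the document untouched when either element or the attribute is missing, so it could be applied unconditionally as one of several fixers.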

@bertsky LGTM but please check I didn't mess anything up when redoing the PR.

bertsky commented 2 years ago

Yes, it is the same code-wise, but I'd prefer to keep the commits separate – see #13.

kba commented 2 years ago

Superseded by #13