qurator-spk / sbb_binarization

Document Image Binarization
Apache License 2.0

ocrd-sbb-binarize does not produce a correct AlternativeImage #8

Closed by mikegerber 3 years ago

mikegerber commented 3 years ago

This is the result I get:

<?xml version="1.0" encoding="UTF-8"?>
<pc:PcGts xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15 http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15/pagecontent.xsd" pcGtsId="OCR-D-IMG-BIN_00000024">
    <pc:Metadata>
        <pc:Creator>OCR-D/core 2.18.1</pc:Creator>
        <pc:Created>2020-10-22T18:49:34.729618</pc:Created>
        <pc:LastChange>2020-10-22T18:49:34.729618</pc:LastChange>
        <pc:MetadataItem type="processingStep" name="preprocessing/optimization/binarization" value="ocrd-sbb-binarize">
            <pc:Labels externalModel="ocrd-tool" externalId="parameters">
                <pc:Label value="/var/lib/sbb_binarization" type="model"/>
                <pc:Label value="page" type="operation_level"/>
            </pc:Labels>
        </pc:MetadataItem>
    </pc:Metadata>
    <pc:Page imageFilename="OCR-D-IMG/OCR-D-IMG_00000024.tif" imageWidth="2463" imageHeight="4060">
        <pc:AlternativeImage filename="OCR-D-IMG-BIN/OCR-D-IMG-BIN_00000024.IMG-BIN.png"/>
    </pc:Page>
</pc:PcGts>

I believe the AlternativeImage is missing comments="binarized".
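
To make the problem easy to reproduce, here is a minimal sketch of a check that lists every AlternativeImage without a comments attribute. It uses only the Python standard library. The function name `alternative_images_missing_comments` is made up for illustration and is not part of OCR-D/core or sbb_binarization. The namespace and filenames are copied from the output above, and the check only tests that some non-empty comments value is present, not which value it is.

```python
import xml.etree.ElementTree as ET
from typing import List

# PAGE namespace as it appears in the output above.
PAGE_NS = {"pc": "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"}


def alternative_images_missing_comments(page_xml: str) -> List[str]:
    """Return the filenames of all AlternativeImage elements that have
    no (or an empty) comments attribute. Hypothetical helper."""
    # Parse from bytes so an XML declaration with an encoding is accepted.
    root = ET.fromstring(page_xml.encode("utf-8"))
    return [
        img.get("filename", "")
        for img in root.iterfind(".//pc:AlternativeImage", PAGE_NS)
        if not img.get("comments")
    ]


# Trimmed-down copy of the page from this report.
SAMPLE = """<pc:PcGts xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
    <pc:Page imageFilename="OCR-D-IMG/OCR-D-IMG_00000024.tif" imageWidth="2463" imageHeight="4060">
        <pc:AlternativeImage filename="OCR-D-IMG-BIN/OCR-D-IMG-BIN_00000024.IMG-BIN.png"/>
    </pc:Page>
</pc:PcGts>"""

if __name__ == "__main__":
    # Prints the filenames that lack a comments attribute.
    print(alternative_images_missing_comments(SAMPLE))
```

With comments="binarized" added to the AlternativeImage element, the same call should return an empty list.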

Workspace: actevedef_718448162.first-page.MISSING-ALTERNATIVE-IMAGE-COMMENTS.zip

cneud commented 3 years ago

> I believe the AlternativeImage is missing comments="binarized"

Correct, well spotted!