impactcentre / ocrevalUAtion

OCR evaluation brought to you by University of Alicante
Apache License 2.0

Could not find or load main class eu.digitisation.Main #29

Open imlabormitlea-code opened 1 year ago

imlabormitlea-code commented 1 year ago

Hi! I tried to run something like

java -cp ocrevaluation.jar eu.digitisation.Main \
    -gt {ground_truth_file} [{encoding}] \
    -ocr {ocr_file} [{encoding}] \
    -d {output_directory} [-r {equivalences_file}] 

from the directory where I downloaded ocrevalUAtion-1.3.4-jar-with-dependencies.jar, and got the following error: "Error: Could not find or load main class eu.digitisation.Main Caused by: java.lang.ClassNotFoundException: eu.digitisation.Main" Could you help me with that?

cneud commented 1 year ago

Hi @imlabormitlea-code, can you try java -cp ocrevalUAtion-1.3.4-jar-with-dependencies.jar eu.digitisation.Main ... (or rename ocrevalUAtion-1.3.4-jar-with-dependencies.jar to ocrevaluation.jar) and check whether that fixes the problem?
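
A likely reason for the error, offered as a sketch rather than a confirmed diagnosis: java silently ignores a -cp entry that points to a file that does not exist, so a jar name that does not match the downloaded file only shows up later as a ClassNotFoundException. The helper below is hypothetical (check_jar is not part of ocrevalUAtion); it only reports whether the file you are about to pass to -cp is actually there.

```shell
#!/bin/sh
# Hypothetical helper, not part of ocrevalUAtion: report whether the
# file that will be passed to "java -cp" exists as a regular file.
check_jar() {
    if [ -f "$1" ]; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Example: the jar name used in the wiki command versus the name of the
# downloaded release file. "|| true" keeps the script going either way.
check_jar ocrevaluation.jar || true
check_jar ocrevalUAtion-1.3.4-jar-with-dependencies.jar || true
```

If the first call prints "missing" and the second prints "found", either pass the long file name to -cp or rename the file, as suggested above.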

imlabormitlea-code commented 1 year ago

Yes, renaming helped. Thanks a lot. I also had to make a few changes to the example provided in the wiki (the flags that differ are -e and -o):

java -cp ocrevaluation.jar eu.digitisation.Main \
    -gt groundtruth.xml -ocr ocr.txt -e utf8 \
    -o output -r equivalences.csv

imlabormitlea-code commented 1 year ago

However, I still get an error for ALTO XML:

Dez. 06, 2022 5:07:58 PM eu.digitisation.utils.log.Messages info
INFORMATION: http://www.loc.gov/standards/alto/alto-v2.0.xsd http://schema.ccs-gmbh.com/ALTO http://www.loc.gov/standards/alto/v2/alto-2-0.xsd http://www.loc.gov/standards/alto/ns-v2#
Dez. 06, 2022 5:07:58 PM eu.digitisation.utils.log.Messages info
INFORMATION: eu.digitisation.Main: Unknown schema location http://www.loc.gov/standards/alto/ns-v4# https://www.loc.gov/standards/alto/v4/alto.xsd for file type ALTO

kba commented 1 year ago

We need a new release to include https://github.com/impactcentre/ocrevalUAtion/pull/25 which fixes that.

IIRC, the workaround was to add

  <entry key="schemaLocation.ALTO">
    http://www.loc.gov/standards/alto/alto-v2.0.xsd 
    http://schema.ccs-gmbh.com/ALTO 
    http://www.loc.gov/standards/alto/v2/alto-2-0.xsd 
    http://www.loc.gov/standards/alto/ns-v2#
    http://www.loc.gov/standards/alto/v3/alto-3-0.xsd
    http://www.loc.gov/standards/alto/ns-v3#
    http://www.loc.gov/standards/alto/v4/alto-4-0.xsd
    http://www.loc.gov/standards/alto/v4/alto-4-1.xsd
    http://www.loc.gov/standards/alto/v4/alto-4-2.xsd
    http://www.loc.gov/standards/alto/ns-v4#
  </entry>

to the userProperties.xml.
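
To confirm that the entry was picked up after editing, a quick text search can help. This is a minimal sketch: has_schema_location is a hypothetical helper, the path to userProperties.xml is assumed to be the current directory, and the check only looks for the URL as a fixed string. It does not parse the XML or reproduce how ocrevalUAtion matches schema locations.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the properties file (first argument)
# contains the given schema location (second argument) as a fixed string.
has_schema_location() {
    grep -qF -- "$2" "$1"
}

# Example usage, assuming userProperties.xml is in the current directory.
if [ -f userProperties.xml ]; then
    if has_schema_location userProperties.xml "http://www.loc.gov/standards/alto/ns-v4#"; then
        echo "ALTO v4 namespace is listed"
    else
        echo "ALTO v4 namespace is missing"
    fi
fi
```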

cneud commented 1 year ago

I will try to make a new release that includes ALTO v4 via https://github.com/impactcentre/ocrevalUAtion/pull/25 next week.