UB-Mannheim / AustrianNewspapers

NewsEye / READ OCR training dataset from Austrian Newspapers (1864–1911)

Rotated Characters, Typesetting Errors #19

Open wollmers opened 4 years ago

wollmers commented 4 years ago

Just for the record.

Some characters are rotated in steps of 90 degrees. This happens often with n/u. In low-quality binarised images of Fraktur, the difference between an n or u and the same letter turned 180 degrees is seldom visible. In such cases I give spelling precedence.
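A minimal sketch of what "giving spelling precedence" could look like if automated: when a token read from the image is not a known word, try swapping n and u at every position and keep the variants that are. The function name `resolve_rotated` and the `lexicon` set are hypothetical illustrations and not part of this dataset's tooling.

```python
from itertools import product

# Glyphs that turn into each other when rotated by 180 degrees.
# Only the n/u pair from the comment above is covered here.
ROTATION_PAIRS = {"n": "u", "u": "n"}


def resolve_rotated(token, lexicon):
    """Return the n/u variants of `token` that appear in `lexicon`.

    `lexicon` is an assumed set of correctly spelled word forms.
    If the token itself is a known word, it is returned unchanged.
    """
    if token in lexicon:
        return [token]
    # For each character, list the glyphs it could stand for.
    options = [
        (ch, ROTATION_PAIRS[ch]) if ch in ROTATION_PAIRS else (ch,)
        for ch in token
    ]
    # Enumerate every combination and keep those found in the lexicon.
    candidates = ("".join(chars) for chars in product(*options))
    return sorted(c for c in candidates if c in lexicon)


if __name__ == "__main__":
    words = {"und", "nun", "Zeitung"}
    print(resolve_rotated("nud", words))
    print(resolve_rotated("Zeitnug", words))
```

The number of variants doubles with every n or u in the token, which is acceptable for single words. An empty result means spelling cannot decide and the image has to be checked by eye.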

Also, in low-quality images of Fraktur, R/K, B/V and M/W look very similar.

Sometimes the typesetter seems to have intentionally used a long s turned 180 degrees as a separator in Hungarian phone numbers. I transcribe these as '|'.

This one is funny:

Bildschirmfoto 2020-07-15 um 17 32 24

wollmers commented 4 years ago

Mixed fonts:

Bildschirmfoto 2020-07-15 um 21 05 37