baskaufs closed this 10 months ago
Useful video series on OCR: https://www.youtube.com/playlist?list=PL2VXyKi-KpYuTAZz__9KVl1jQz74bDG7i
Note: not all images with text are posters. For example, the Daumier prints have captions. We need to find out what the classification types would be in Wikidata and decide how to sort the images into them. I think right now they are all classified as "print".
We also need to find out the property used to expose the text. "Caption"? "Inscription"?
Basically finished this with Emily's project in fall 2023.
This is a refinement of #3 and is also related to #23
The category of "prints" includes both art prints (with labels that are generally titles) and posters (with labels that are generally the text on the poster). We can use OCR and compare the results with the labels to confirm whether the print is a poster or not. Some preliminary work has been done on this using Keras OCR.
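The comparison step could be sketched roughly as follows. This is not the code from the preliminary Keras OCR work; it assumes the OCR step has already produced a list of recognized words, and the function names, the 0.8 per-word fuzzy-match cutoff, and the 0.6 threshold are all hypothetical values that would need tuning against real records.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def label_ocr_similarity(label: str, ocr_words: list[str]) -> float:
    """Return the fraction of label tokens with a close match in the OCR output.

    ocr_words is assumed to be the list of words recognized in the image
    by whatever OCR tool is used upstream.
    """
    label_tokens = normalize(label)
    ocr_tokens = normalize(" ".join(ocr_words))
    if not label_tokens or not ocr_tokens:
        return 0.0
    matched = 0
    for token in label_tokens:
        # Fuzzy matching tolerates small OCR misreadings of a word.
        best = max(SequenceMatcher(None, token, o).ratio() for o in ocr_tokens)
        if best >= 0.8:  # hypothetical per-word cutoff
            matched += 1
    return matched / len(label_tokens)


def looks_like_poster(label: str, ocr_words: list[str], threshold: float = 0.6) -> bool:
    """Guess that a print is a poster when its label mostly appears in the image text."""
    return label_ocr_similarity(label, ocr_words) >= threshold


if __name__ == "__main__":
    # Poster-like case: the label repeats the text found on the image.
    print(looks_like_poster("Buy War Bonds", ["BUY", "WAR", "BONDS", "TODAY"]))
    # Captioned art print: text is present but does not match the title.
    print(looks_like_poster("The Legislative Belly",
                            ["aspect", "des", "bancs", "ministeriels"]))
```

A captioned print such as the Daumier examples would still yield OCR text, so the presence of text alone is not enough; the check only flags a poster when the label itself is largely found in that text.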