new instance for layout analysis

OpenPecha / prodigy-tools

Tools for OpenPecha's use of Prodigy

MIT License


new instance for layout analysis #10

Open eroux opened 1 year ago

eroux commented 1 year ago

We should launch a new instance for layout analysis, which means:

[x] a new systemd service
[x] a new configuration file
[x] a new recipe
[x] a new nginx configuration on the server + a new ssl certificate
[x] a new set of tags that can be associated with an image
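As a rough illustration of the "new configuration file" item, here is a minimal sketch that writes a separate `prodigy.json` for the layout-analysis instance. It assumes each instance gets its own Prodigy home directory (selected through the `PRODIGY_HOME` environment variable in the systemd unit) and its own port for nginx to proxy to. The function name, directory and port number are hypothetical, not what is deployed on the server.

```python
import json
import tempfile
from pathlib import Path


def write_instance_config(home_dir, port, host="127.0.0.1"):
    """Write a prodigy.json for one instance and return its path.

    Only the "host" and "port" settings are written here; anything else
    (database, theme, ...) is left at Prodigy's defaults.
    """
    home = Path(home_dir)
    home.mkdir(parents=True, exist_ok=True)
    config = {"host": host, "port": port}
    path = home / "prodigy.json"
    path.write_text(json.dumps(config, indent=2))
    return path


if __name__ == "__main__":
    # Hypothetical values: a throwaway directory and port 8090.
    with tempfile.TemporaryDirectory() as tmp:
        config_path = write_instance_config(Path(tmp) / "layout_analysis", 8090)
        print(config_path.read_text())
```

Binding to `127.0.0.1` keeps the instance reachable only through the nginx reverse proxy, which is where the SSL certificate would live.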

I'm not sure what the tags will be. @eric86y pointed out https://journal.digitalmedievalist.org/article/id/8073/, which could be a good example.

I think there should be at least:

text area
illustration
margin
header/footer
perhaps string holes or pseudo-string holes would also be useful?

eroux commented 1 year ago

Also, we should define how illustrations work, perhaps in an annotation manual. Do annotators tag the illustration and its caption together, or do we create 2 different tags? Also, what exactly do we tag as the margin? Do we count the frame as part of the text area? Do we tag stamps? Here's an interesting example we should annotate:

https://iiif.bdrc.io/bdr:I1NLM2739_001::I1NLM2739_0010001.jpg/full/max/0/default.jpg

ngawangtrinley commented 1 year ago

Sounds good! We did a bit of testing/research and agreed to start with 5 basic classes like the ones you listed + an "other" class. At the end of next week we'll review what was done and revise the tagset as needed.

Portrait/Book format:

text area
illustration
caption
margin
header
footer
hole
other

Landscape/Pecha format:

text area
illustration
caption
margin
hole
other
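The two tagsets above could be kept in one place in code, for example as below. This is only a sketch: the dictionary keys and the helper are hypothetical names, and picking the tagset from the image's aspect ratio is an assumption, not something decided in this thread.

```python
# Tagsets copied from the lists above; the keys are illustrative names.
TAGSETS = {
    "portrait": [
        "text area", "illustration", "caption", "margin",
        "header", "footer", "hole", "other",
    ],
    "landscape": [
        "text area", "illustration", "caption", "margin",
        "hole", "other",
    ],
}


def labels_for(image_width, image_height):
    """Return the tagset for an image.

    Assumption: an image wider than it is tall is a landscape/pecha page,
    anything else is treated as portrait/book format.
    """
    fmt = "landscape" if image_width > image_height else "portrait"
    return TAGSETS[fmt]
```

Keeping the lists in a single mapping should make the planned revision of the tagset a one-place change.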

eroux commented 1 year ago

Also, which images do you want to tag? The entirety of the BDRC images? If so, we need to work out how to create a recipe for that. I'm actually not quite sure how to do it, because I guess we don't want to just annotate them in alphabetical order.
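One possible way to avoid alphabetical order, sketched under assumptions: group the image identifiers (here by the part before `::`, as in the example identifier `bdr:I1NLM2739_001::I1NLM2739_0010001.jpg`), shuffle within each group, then take one image from each group in turn so that early annotation batches cover many volumes. The function name and the grouping rule are hypothetical, and this says nothing about how the full list of BDRC images would be obtained.

```python
import random
from collections import defaultdict
from itertools import zip_longest


def interleaved_sample(image_ids, group_of, seed=0):
    """Yield every image id once, rotating across groups.

    group_of maps an image id to its group (e.g. a volume). A fixed seed
    makes the order reproducible between runs.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for image_id in image_ids:
        groups[group_of(image_id)].append(image_id)

    buckets = list(groups.values())
    for bucket in buckets:
        rng.shuffle(bucket)   # random order inside each group
    rng.shuffle(buckets)      # random order of the groups themselves

    # One image per group per round; shorter groups simply run out
    # (zip_longest pads them with None, which is skipped).
    for round_ in zip_longest(*buckets):
        for image_id in round_:
            if image_id is not None:
                yield image_id


def volume_of(image_id):
    """Assumed grouping: everything before '::' identifies the volume."""
    return image_id.split("::")[0]
```

This loads all identifiers into memory, so for the whole BDRC collection a first pass that samples a subset of volumes would probably be needed.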

ta4tsering commented 1 year ago

Just created a recipe for layout analysis; the image below shows how it looks for now.

[Image: Screenshot 2023-01-20 at 4 30 14 PM]
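For readers without access to the screenshot, a minimal sketch of what an image-annotation recipe typically returns is given here. This is not the recipe from this repository: the function name and arguments are hypothetical, and it assumes Prodigy's `image_manual` interface with the label list passed through the recipe's `config`. In a real recipe the function would be registered with the `@prodigy.recipe` decorator; that is left out so the sketch runs without Prodigy installed.

```python
def layout_analysis_components(dataset, image_urls, labels):
    """Build the dictionary of components a Prodigy recipe returns.

    dataset    -- name of the Prodigy dataset to save annotations to
    image_urls -- iterable of image URLs (e.g. IIIF URLs)
    labels     -- tagset shown to the annotator
    """
    # Each task is a dict with an "image" key pointing at the image.
    stream = ({"image": url} for url in image_urls)
    return {
        "dataset": dataset,
        "stream": stream,
        "view_id": "image_manual",           # manual region annotation UI
        "config": {"labels": list(labels)},  # tags offered in the UI
    }
```

For example, calling it with the IIIF URL quoted earlier in this thread and the landscape tagset would produce a one-image stream for testing the interface.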

kaldan007 commented 1 year ago

looks good

eroux commented 1 year ago

for modern book formats, I think better tags would be:

text area
margin
header
footer
footnote area
other

for Chinese book format we can tag the registers as text areas