CopticScriptorium / corpora

Public repository for Coptic SCRIPTORIUM Corpora Releases

Update verse and chapter to verse_n, chapter_n #42

Closed amir-zeldes closed 4 years ago

amir-zeldes commented 4 years ago

Some documents still use the legacy annotation names chapter and verse. These should be renamed to chapter_n and verse_n, and the documentation in the wiki should be updated to match.

Documents with chapter: https://corpling.uis.georgetown.edu/annis/scriptorium#_q=Y2hhcHRlcg&ql=aql&_c=YXBvcGh0aGVnbWF0YS5wYXRydW0sYmVzYS5sZXR0ZXJzLGRvYy5wYXB5cmksZG9ybWl0aW9uLmpvaG4sam9oYW5uZXMuY2Fub25zLGxpZmUuY3lydXMsbGlmZS5sb25naW51cy5sdWNpdXMsbGlmZS5vbm5vcGhyaXVzLG1hcnR5cmRvbS52aWN0b3IscHJvY2x1cy5ob21pbGllcyxwc2V1ZG8uZXBocmVtLHBzZXVkby50aGVvcGhpbHVzLHNhaGlkaWNhLjFjb3JpbnRoaWFucyxzYWhpZGljYS5tYXJrLHNoZW5vdXRlLmEyMixzaGVub3V0ZS5hYnJhaGFtLHNoZW5vdXRlLmRpcnQsc2hlbm91dGUuZWFnZXJuZXNzLHNoZW5vdXRlLmZveA&cl=5&cr=5&s=0&l=10&_seg=Ym05eWJWOW5jbTkxY0E

Documents with verse: https://corpling.uis.georgetown.edu/annis/scriptorium#_q=dmVyc2U&ql=aql&_c=YXBvcGh0aGVnbWF0YS5wYXRydW0sYmVzYS5sZXR0ZXJzLGRvYy5wYXB5cmksZG9ybWl0aW9uLmpvaG4sam9oYW5uZXMuY2Fub25zLGxpZmUuY3lydXMsbGlmZS5sb25naW51cy5sdWNpdXMsbGlmZS5vbm5vcGhyaXVzLG1hcnR5cmRvbS52aWN0b3IscHJvY2x1cy5ob21pbGllcyxwc2V1ZG8uZXBocmVtLHBzZXVkby50aGVvcGhpbHVzLHNhaGlkaWNhLjFjb3JpbnRoaWFucyxzYWhpZGljYS5tYXJrLHNoZW5vdXRlLmEyMixzaGVub3V0ZS5hYnJhaGFtLHNoZW5vdXRlLmRpcnQsc2hlbm91dGUuZWFnZXJuZXNzLHNoZW5vdXRlLmZveA&cl=5&cr=5&s=0&l=10&_seg=Ym05eWJWOW5jbTkxY0E
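For readers who cannot open the links: the `_q` value in each URL fragment appears to be the search query in unpadded base64. The following is a minimal sketch for decoding it, assuming that encoding; the helper name `decode_annis_param` is made up for illustration and is not part of ANNIS.

```python
import base64


def decode_annis_param(value):
    """Decode an unpadded base64 value taken from a link fragment (assumed encoding)."""
    # Restore the '=' padding that was stripped from the URL.
    padded = value + "=" * (-len(value) % 4)
    return base64.urlsafe_b64decode(padded).decode("utf-8")


print(decode_annis_param("Y2hhcHRlcg"))  # chapter
print(decode_annis_param("dmVyc2U"))     # verse
```

The same helper can be applied to the `_c` value to list the corpora each query was run against.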

lancealanmartin commented 4 years ago

I have updated 'chapter' and 'verse' to 'chapter_n' and 'verse_n' for all relevant documents.
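A rough sketch of the rename, assuming each document's annotation layers are available as a list of column names. The mapping is taken from this issue; the function name and the sample column names `tok` and `norm` are only illustrative, not the project's actual tooling.

```python
# Legacy annotation names and their replacements, as described in this issue.
LEGACY_NAMES = {"chapter": "chapter_n", "verse": "verse_n"}


def rename_legacy_columns(headers):
    """Return a copy of headers with legacy annotation names replaced."""
    return [LEGACY_NAMES.get(name, name) for name in headers]


# Hypothetical column list for one document.
print(rename_legacy_columns(["tok", "norm", "chapter", "verse"]))
# ['tok', 'norm', 'chapter_n', 'verse_n']
```

Columns that already use the new names pass through unchanged, so the function can be run over every document without checking first.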

@ctschroeder and @amir-zeldes, I noticed a few inconsistencies in the Corinthians docs. Chapters 1-10 have the word 'chapter' in their titles, e.g., '1 Corinthians Chapter 7'. Chapters 11-16 use the following model: '1 Corinthians 11'. Which of these should it be?

In the spreadsheet, the biblical chapters 13-16 have a chapter_n column but 1-12 do not. Should I add chapter_n columns for all docs?

amir-zeldes commented 4 years ago

Thanks for doing this, @lancealanmartin. The Bible data is a little different: since every document is exactly one chapter, I don't think we really need a chapter_n span. What do you think, @ctschroeder? For the title, I think it looks better without 'chapter', but I don't feel strongly about it.

ctschroeder commented 4 years ago

Thank you @lancealanmartin. Thank you for asking @amir-zeldes.

In the titles, I also probably prefer 1 Corinthians 1 (no chapter).
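If it helps, here is a minimal sketch of normalizing titles to that form, assuming the titles are plain strings that end in the chapter number. The function name `normalize_title` is hypothetical.

```python
import re


def normalize_title(title):
    """Drop the word 'Chapter' before a trailing chapter number, if present."""
    return re.sub(r"\s+Chapter\s+(\d+)$", r" \1", title.strip())


print(normalize_title("1 Corinthians Chapter 7"))  # 1 Corinthians 7
print(normalize_title("1 Corinthians 11"))         # 1 Corinthians 11
```

Titles already in the shorter form are returned unchanged.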

The reason for the chapter span would be to enable search, so I suggest we put in the spans. I know it sounds odd, but someone might well search on that annotation without knowing that we don’t have the span.

amir-zeldes commented 4 years ago

That's true - the Bible data does already have a chapter metadatum in each document, but if users expect the span annotation to be there for all data, they might expect it there as well.

ctschroeder commented 4 years ago

Hi. How is this going?

lancealanmartin commented 4 years ago

It is finished. All docs should have chapter_n and verse_n.