Canon Component Enum, featuring DAN/DAG and EST/ESG

mvahowe commented 4 years ago

Right now we have something like the list at the end of this post. I built this list from an XML file buried inside Paratext. The file had no labels, so I made these up, expecting someone to comment on them, which never happened.

So we could run with this list for now, but I think we at least need to sort out the DAN/DAG and EST/ESG mess. Essentially, that mess is:

EST used to be Esther, regardless of the canon
Then someone decided to add ESG for "Greek Esther" (mixed Hebrew and Greek OT books, typically used by Catholic and Orthodox churches)
But old projects never got upgraded, so they may have Greek content in EST
And some translators don't like ESG because they think it's pejorative, so they use EST deliberately
And some projects have both (for example to produce both a Protestant and Catholic publication)
And all this goes for DAN/DAG too.

And the result is that we have canon components that differ only in the name of the file for Esther and/or Daniel. (One of the variants in each case ends with "2" in the list below.) In SB, filenames are not supposed to matter anyway. But, for the "both EST and ESG" scenario, I think we need another way to link to the correct information in the versification file. Any ideas on how to do this? (I suspect that EST/ESG and DAN/DAG are the tip of the iceberg here, because the USFM for different church traditions could diverge in other ways. So I think we should aim for a generic way to say "this versification for this book".)
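To make "this versification for this book" concrete, here is a minimal sketch of one possible shape: each project declares, per book code, which versification entry applies, so nothing is inferred from the file name. Everything in it is an assumption for illustration only: the function name `resolve_versification`, the two example projects, and the keys `esther-hebrew` and `esther-greek` are invented and do not come from SB, Paratext, or any versification file.

```python
# Hypothetical per-project declarations: book code -> versification key.
# The keys ("esther-hebrew", "esther-greek") are invented labels, not real
# identifiers from any versification file.
LEGACY_PROJECT = {"EST": "esther-greek"}  # old project, Greek content left in EST
DUAL_PROJECT = {"EST": "esther-hebrew", "ESG": "esther-greek"}  # both books present


def resolve_versification(declarations, book_code):
    """Return the versification key the project declares for book_code.

    Raises KeyError when the project declares nothing for that book,
    instead of guessing from the book code or file name.
    """
    try:
        return declarations[book_code]
    except KeyError:
        raise KeyError(f"no versification declared for {book_code}") from None
```

With this shape, the same code `EST` can resolve to different versification entries in different projects, which is the case that the "2" variants currently encode in the component name.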

canonComponentEnum =
    "armenianApostolicDC" |
    "armenianApostolicOT" |
    "armenianApostolicOT2" |
    "armenianClassicalOT" |
    "armenianNT" |
    "catholicAndAnglicanDC" |
    "catholicLxxDC" |
    "catholicLxxOT" |
    "catholicLxxSeparatedDC" |
    "catholicPlusLutheranDC" |
    "catholicVulgateDC" |
    "catholicVulgateOT" |
    "catholicVulgateSeparatedDC" |
    "czechKralickaDC" |
    "danishLutheranDC" |
    "ethiopianOrthodoxDC" |
    "ethiopianOrthodoxNT" |
    "ethiopianOrthodoxOT" |
    "ethiopianProtestantNT" |
    "ethiopianProtestantOT" |
    "georgianOrthodoxDC" |
    "georgianOrthodoxOT" |
    "georgianOrthodoxOT2" |
    "georgianSynodalDC" |
    "germanLutheranDC" |
    "greekOrthodoxDC" |
    "greekOrthodoxOT" |
    "kjvDC" |
    "kjvNonDC" |
    "lutheranNT" |
    "romanianOrthodoxDC" |
    "romanianOrthodoxOT" |
    "russianNT" |
    "russianOrthodoxDC" |
    "russianOrthodoxOT" |
    "russianProtestantOT" |
    "russianSynodalDC" |
    "syriacNT" |
    "syriacOT" |
    "tanakhOT" |
    "turkishInterconfessionalDC" |
    "vulgateCatholicBible" |
    "westernInterconfessionalDC" |
    "westernInterconfessionalDC2" |
    "westernNT" |
    "westernOT" |
    xsd:string { pattern = "X-[A-Za-z0-9][A-Za-z0-9\-]*" }
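
For anyone who wants to see which entries exist only as file-name variants, the sketch below pairs each label ending in "2" with its base label, and checks custom values against the `X-` pattern from the last line of the enum. It uses only a subset of the labels above, and the helper names `variant_groups` and `is_valid_component` are made up for this illustration. The regex is applied with `fullmatch` on the assumption that the schema pattern must match the whole value, as XSD pattern facets do.

```python
import re
from collections import defaultdict

# Subset of the labels listed above, enough to show the "2" variants.
LABELS = [
    "armenianApostolicOT", "armenianApostolicOT2",
    "georgianOrthodoxOT", "georgianOrthodoxOT2",
    "westernInterconfessionalDC", "westernInterconfessionalDC2",
    "westernOT",
]

# Same pattern as the xsd:string line in the enum.
X_PATTERN = re.compile(r"X-[A-Za-z0-9][A-Za-z0-9\-]*")


def variant_groups(labels):
    """Group labels that differ only by a trailing "2"."""
    groups = defaultdict(list)
    for label in labels:
        base = label[:-1] if label.endswith("2") else label
        groups[base].append(label)
    # Keep only bases that actually have more than one variant.
    return {base: members for base, members in groups.items() if len(members) > 1}


def is_valid_component(label, known=LABELS):
    """True for a known label or a custom value matching the X- pattern."""
    return label in known or X_PATTERN.fullmatch(label) is not None
```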

mvahowe commented 4 years ago

This is (probably) about to become a pain point as I implement the canonSpec logic. That is, PT has canons that differ only in whether they use, say, DAN or DAG, and we have no idea how anyone has used DAN. To anyone not immersed in the history of PT, this is hopelessly confusing.

jag3773 commented 3 years ago

A possible way forward would be to use whatever role name we want in SB metadata but still link the existing USFM/USX files/names as they currently are.
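
A rough sketch of what I mean, with the caveat that the structure and names are invented for illustration and are not the actual SB metadata schema: the existing file names stay untouched, and a separate `role` value (a hypothetical field, with hypothetical values such as `esther-greek`) says what each file holds.

```python
# Hypothetical metadata fragment: existing file names are kept as they are,
# and an assumed "role" field records what each file actually contains.
INGREDIENTS = {
    "EST.usfm": {"role": "esther-greek"},  # legacy file name, Greek content
    "DAN.usfm": {"role": "daniel"},
    "DAG.usfm": {"role": "daniel-greek"},
}


def files_with_role(ingredients, role):
    """Return the file names declared with the given role, sorted."""
    return sorted(
        path for path, meta in ingredients.items() if meta.get("role") == role
    )
```

A consumer would then look files up by role, for example `files_with_role(INGREDIENTS, "esther-greek")`, and never interpret the file name itself.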

jonathanrobie commented 3 years ago

This is not a problem that Scripture Burrito should solve.

When creating a project, users think in terms of whatever canon they use. An Orthodox team might think these portions are simply part of Daniel or Esther. A Protestant team might think the same text belongs in something they call the Apocrypha, placing the same portions in things they think of as "Daniel Greek" or "Esther Greek".

A publisher needs to understand these relationships to use this data intelligently. For example, they may choose to map the same content differently if they publish Protestant, Catholic, and Orthodox editions of the same translation. A publisher who encounters ESG and DAG needs to know what that means and where it would appear in the Septuagint text. A publisher who sees that an Orthodox canon is in use needs to understand the ramifications.

Explaining this to the publisher is out of scope for Scripture Burrito.

bible-technology / scripture-burrito

Canon Component Enum, featuring DAN/DAG and EST/ESG #143