Copenhagen-Alliance / versification-specification

Versification mappings and versification sniffing

Can the code work with USX from eBible? #18

Open davidbaines opened 2 years ago

davidbaines commented 2 years ago

There is a large corpus of openly licensed Bibles in many languages at https://ebible.org. It would be a good test of this system to see whether the code can create the custom versification files required to align those Bibles correctly.

Can the code be used to create an alignment file from this example data? https://ebible.org/Scriptures/engPEV_usfx.zip

jonathanrobie commented 2 years ago

Hi David - yes, that would be a good test. Have you tried running usx2versification.py to create the JSON representation, then json2vrs.py to convert it to a .VRS file?

Or are you asking for documentation? What do you need in order to run the test you propose?

davidbaines commented 2 years ago

I ran both scripts to the best of my ability. The resulting JSON files were nearly empty: they contained only what looks like boilerplate, listing the books for which a file exists in the folder but with no verse data. Here's an example result from Angave (aak):

{
    "shortname": "aak",
    "maxVerses": {
        "MAT": [],
        "MRK": [],
        "LUK": [],
        "JHN": [],
        "ACT": [],
        "ROM": [],
        "1CO": [],
        "2CO": [],
        "GAL": [],
        "EPH": [],
        "PHP": [],
        "COL": [],
        "1TH": [],
        "2TH": [],
        "1TI": [],
        "2TI": [],
        "TIT": [],
        "PHM": [],
        "HEB": [],
        "JAS": [],
        "1PE": [],
        "2PE": [],
        "1JN": [],
        "2JN": [],
        "3JN": [],
        "JUD": [],
        "REV": []
    },
    "partialVerses": {},
    "verseMappings": {},
    "excludedVerses": {},
    "unexcludedVerses": {}
}
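A quick way to confirm that a result like this is empty, rather than just sparse, is to list the books whose `maxVerses` entry has no values. The sketch below is only an illustration: `empty_books` is a hypothetical helper, not part of this repository, and it assumes nothing beyond the JSON layout shown above.

```python
import json


def empty_books(versification: dict) -> list[str]:
    """Return the book codes whose maxVerses list is empty.

    Hypothetical helper: it relies only on the "maxVerses" key
    visible in the JSON output above.
    """
    max_verses = versification.get("maxVerses", {})
    return [book for book, verses in max_verses.items() if not verses]


# A shortened copy of the output above, with one made-up non-empty
# entry (JHN) added purely to show the contrast.
sample = json.loads("""
{
    "shortname": "aak",
    "maxVerses": {
        "MAT": [],
        "MRK": [],
        "JHN": [51, 25]
    }
}
""")

missing = empty_books(sample)
print(f"{sample['shortname']}: {len(missing)} book(s) without verse counts: {missing}")
```

For a real run, load the generated file with `json.load` instead of the inline sample. If every book is reported, the script found the book files but extracted no chapter or verse data from them.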

On the webpage for the Angave translation there is a link to "Formats for developers", which reveals the option to download the USFX files.
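Because the issue title mentions USX while the download is USFX, it may be worth checking which format a given file really is before feeding it to the scripts. The sketch below is an assumption-laden illustration: `root_tag` is a hypothetical helper, and it relies only on the fact that a USX document has a root element named `usx` and a USFX document has a root element named `usfx`. Whether a format mismatch explains the empty output above has not been confirmed in this thread.

```python
import io
import xml.etree.ElementTree as ET


def root_tag(source) -> str:
    """Return the lower-cased root element name of an XML document.

    `source` may be a file path or a binary file-like object.
    Any namespace prefix of the form "{uri}" is stripped.
    Hypothetical helper, not part of the repository.
    """
    tag = ET.parse(source).getroot().tag
    return tag.rsplit("}", 1)[-1].lower()


# Tiny in-memory stand-ins for real files (contents are illustrative only).
usx_doc = io.BytesIO(b'<usx version="3.0"><book code="MAT"/></usx>')
usfx_doc = io.BytesIO(b'<usfx><book id="MAT"/></usfx>')

for name, doc in [("usx example", usx_doc), ("usfx example", usfx_doc)]:
    print(name, "->", root_tag(doc))
```

On real data, pass the path of each extracted file instead of the in-memory examples.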

The information below indicates where to find additional test data if required. This csv file has details about all the available translations. Here is an index.