avendesora / pythonbible-parser

A Python library for parsing Bible texts in various formats and converting them into a format for easy and efficient use in Python.
MIT License

Instructions for parsing additional Bible translations #66

Open the-pied-shadow opened 2 months ago

the-pied-shadow commented 2 months ago

With only two translations included in the pythonbible library, it would be really awesome to have more. The README mentions that this module should be able to parse other OSIS XML files so that more versions can be used by pythonbible, but I can't figure out how to do it properly. I have a master list of XML files that are supposedly in OSIS format, but the parser doesn't seem to parse them properly, and they don't look like the XML files for the ASV and KJV currently in the pythonbible library. I could really use help with this; it's the only thing about pythonbible that keeps it from being usable for my needs.

the-pied-shadow commented 2 months ago

It's been hard to even find XML files for Bible versions. Instructions on where to locate those would also be helpful.

avendesora commented 1 month ago

Hi!

Sorry for the delayed response! It's been a very busy year so far.

Is your list of XML files available online? If so, please include a link and I will try them out. I only ever had ASV and KJV, and those may not exercise everything the OSIS format allows, so there may be bugs (or missing features) in the parser.

I honestly don't remember where I found the ASV and KJV versions I used, but I will try to find that source again.

This is meant to be able to parse any OSIS file (and hopefully eventually support other formats), but I do want to be careful to make sure that only open-source or public domain versions are actually checked into this repository and bundled with the library to avoid issues with copyrights/licenses.

the-pied-shadow commented 1 month ago

Definitely understand wanting to avoid copyright infringement. I found this repo on GitHub, which claims to be a "collection of freely licensed translations of biblical text in OSIS format." However, their ASV and KJV XML files are not the same as yours, so there is some kind of discrepancy. Hope this helps. Apologies for not including the link in the original post; that would have made sense.

This is a really great tool already and a phenomenal blessing. Thank you.

avendesora commented 1 month ago

Thanks! That is very helpful. I will try those out and will hopefully have an answer soon.

avendesora commented 1 month ago

Those OSIS files appear to be in a slightly different format, maybe a different version of the OSIS format. I was able to get them to parse with a few small tweaks, and have created a branch and a draft pull request with that work. I want to attempt to parse all of those OSIS files and run some tests before calling it complete, though.
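
For anyone who wants to see how two OSIS files differ structurally before running them through the parser, here is a rough sketch using only the standard library. It is not part of pythonbible-parser; `summarize_osis`, `local_name`, and the sample document are made up for illustration. It assumes the differences show up in the namespace, the set of element names, or in whether verses wrap their text (container style) or are empty `sID`/`eID` markers (milestone style), which may not be what actually differed in this case.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag: str) -> str:
    """Drop a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def summarize_osis(xml_text: str) -> dict:
    """Summarize the structure of an OSIS document (hypothetical helper)."""
    root = ET.fromstring(xml_text)

    # ElementTree writes namespaced tags as '{uri}name'.
    namespace = root.tag[1:].split("}", 1)[0] if root.tag.startswith("{") else ""

    # Count every element name so two files can be compared side by side.
    tags = Counter(local_name(element.tag) for element in root.iter())

    # Milestone verses are empty markers carrying sID/eID attributes;
    # container verses wrap the verse text directly.
    verses = [el for el in root.iter() if local_name(el.tag) == "verse"]
    milestones = sum(1 for v in verses if "sID" in v.attrib or "eID" in v.attrib)

    return {
        "namespace": namespace,
        "tags": dict(tags),
        "container_verses": len(verses) - milestones,
        "milestone_verses": milestones,
    }


# Tiny illustrative document, not taken from any real translation file.
SAMPLE = """<osis xmlns="http://www.bibletechnologies.net/2003/OSIS/namespace">
  <osisText osisIDWork="SAMPLE">
    <div type="book" osisID="Gen">
      <chapter osisID="Gen.1">
        <verse osisID="Gen.1.1">In the beginning God created the heaven and the earth.</verse>
      </chapter>
    </div>
  </osisText>
</osis>"""

summary = summarize_osis(SAMPLE)
print(summary)
```

Running the same function over a file from each source and comparing the two dictionaries should show quickly whether one source uses elements or a verse style that the other does not.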

I apparently got my original OSIS files from eBible.org, though I can't find OSIS files on that site anymore. They do have USFX and USFM formatted Bible versions, and I'm also starting the work on parsing one or both of those formats. I'll move that work over to a different branch, though, so it can be worked on separately.

The files you sent me are much simpler in their content. They don't have any formatting information, book titles, or pretty much anything extra beyond the Scripture text. So, I'll probably stick with the versions I have for KJV and ASV, but having all the other versions in multiple languages will still be very useful even without the extra stuff.
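
For files that carry only the Scripture text, a minimal sketch of pulling the verses out by their `osisID` could look like the following. This is an illustration with the standard library, not how pythonbible-parser does it; `verse_texts` and the sample document are hypothetical, and the sketch assumes container-style verses (the text sits inside each `<verse>` element).

```python
import xml.etree.ElementTree as ET


def verse_texts(xml_text: str) -> dict[str, str]:
    """Map each osisID to its verse text (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    verses = {}
    for element in root.iter():
        # Compare on the local name so the namespace does not matter.
        if element.tag.rsplit("}", 1)[-1] != "verse":
            continue
        osis_id = element.get("osisID")
        # Skip milestone markers: they carry no text of their own.
        if osis_id is None or "sID" in element.attrib:
            continue
        # itertext() also collects text from nested elements;
        # split/join collapses line breaks and repeated spaces.
        verses[osis_id] = " ".join("".join(element.itertext()).split())
    return verses


# Tiny illustrative document, not taken from any real translation file.
SAMPLE = """<osis xmlns="http://www.bibletechnologies.net/2003/OSIS/namespace">
  <osisText osisIDWork="SAMPLE">
    <div type="book" osisID="Gen">
      <chapter osisID="Gen.1">
        <verse osisID="Gen.1.1">In the beginning God created
          the heaven and the earth.</verse>
        <verse osisID="Gen.1.2">And the earth was without form, and void.</verse>
      </chapter>
    </div>
  </osisText>
</osis>"""

for osis_id, text in verse_texts(SAMPLE).items():
    print(osis_id, text)
```

Titles, notes, and formatting would need extra handling in richer files, since `itertext()` folds any nested element text into the verse.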