adyeths / u2o

USFM to OSIS bible format converter.
The Unlicense

What is u2o?

u2o is a fast Python program that converts USFM formatted bibles to OSIS XML. It currently targets version 3.0 of the USFM specification, which bible translators use when translating scripture into different languages.

Why did I write it?

The SWORD Project has a script called usfm2osis.py that they use for converting USFM formatted bibles to OSIS XML for use with their software. Since I'm familiar with Python, I decided to test it to see how well it worked. The results of that testing prompted me to write this alternative.

The Result

u2o is quite fast. For example, it takes only about 10 seconds to process the World English Bible on my old computer. In my testing, that is about a 90% reduction in processing time compared with usfm2osis.py.

The output validates against the OSIS 2.1.1 schema. No markup errors are reported by osis2mod when generating modules for any of the bibles that I have access to at this time.

I've tested it, and it works fine with recent versions of Python 3. It also works with PyPy3, but runs a lot slower. It will NOT work with Python 2.

The Alternatives

There are of course other programs that convert USFM to OSIS. Here are the ones I am familiar with:

cu2o

This is a simple wrapper for u2o.py that allows processing of USFM files that have been concatenated into a single file. Consider it experimental. Note that it requires u2o in order to work.
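
To illustrate what handling a concatenated file involves, here is a minimal sketch of splitting one back into separate books. It is not taken from cu2o, and the function name split_usfm_books is hypothetical. It assumes only that every book begins with an \id marker at the start of a line, as the USFM specification requires.

```python
import re

# Hypothetical helper for illustration only; this is NOT cu2o's actual code.
# Assumption: each book in the concatenated text starts with an \id marker
# at the beginning of a line.
BOOK_START = re.compile(r"(?m)^\\id\s")


def split_usfm_books(text):
    """Return a list with one string per book found in concatenated USFM text.

    Any content that appears before the first \\id marker is ignored.
    """
    # Offsets where each book begins, plus the end of the text as a sentinel.
    bounds = [match.start() for match in BOOK_START.finditer(text)]
    bounds.append(len(text))
    # Slice the text between each pair of consecutive offsets.
    return [text[start:end] for start, end in zip(bounds, bounds[1:])]
```

Each returned string could then be written to its own file, or handed to a converter one book at a time. The trailing \s in the pattern keeps the related \ide (encoding) marker from being mistaken for the start of a book.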