schierlm / BibleMultiConverter

Converter written in Java to convert between different Bible program formats

SWORD to Zefania XML, without Note tags and correct bsname? #11

Closed j2l closed 5 years ago

j2l commented 7 years ago

OK, I found so many mistakes in Bibles found here and there (missing verses, missing book names, ...) that I'm re-converting from SWORD. The terrible thing is that the same version gets copied again and again, errors included. For example, every file I found of Smith and van Dyck's al-Kitab al-Muqaddas has the same error.

To help others (I didn't understand the passage about SWORD in the readme at first), here's the command:

java -jar BibleMultiConverter-AllInOneEdition.jar SWORD modules\texts\ztext\arasvd ZefaniaXMLMyBible arasvd_zef.xml

The osisID is correct, but I get a NOTE tag in the way, for verse 1 of each chapter only:

<BIBLEBOOK bname="Genesis" bnumber="1" bsname="Gen">
    <CHAPTER cnumber="1">
      <VERS vnumber="1">
        <DIV>
          <NOTE type="x-studynote">&lt;p&gt;<DIV>
              <NOTE> <BR art="x-nl"/>
              </NOTE>
            </DIV>&lt;/p&gt;</NOTE>
        </DIV>فِي الْبَدْءِ خَلَقَ اللهُ السَّمَاوَاتِ وَالأَرْضَ.</VERS>

My problem is that the escaped &lt;p&gt; and &lt;/p&gt; are interpreted as part of the verse, so I'd prefer output without notes at all. That's limiting, but I'll work on these features in JSON in the future.

The other Zefania export format doesn't handle osisID well: <BIBLEBOOK bname="Genesis" bnumber="1" bsname="Genesis"> while bsname should be Gen like in ZefaniaXMLMyBible.

Is there a way to get the most basic version like:

<BIBLEBOOK bname="Genesis" bnumber="1" bsname="Genesis">
    <CHAPTER cnumber="1">
      <VERS vnumber="1">فِي الْبَدْءِ خَلَقَ اللهُ السَّمَاوَاتِ وَالأَرْضَ.</VERS>

? God bless you!

schierlm commented 7 years ago

Hello Phil,

To get rid of features, you can do a second conversion to StrippedDiffable format.

java -jar BibleMultiConverter-AllInOneEdition.jar SWORD modules\texts\ztext\arasvd StrippedDiffable tmp.txt StripFootnotes StripHeadlines StripFormatting

It will output what it stripped and which other features you could strip too (you can append as many features to strip as you want to the end of the command). In this example, footnotes, headlines and formatting are stripped. Then, as a second step, convert from Diffable:

java -jar BibleMultiConverter-AllInOneEdition.jar Diffable tmp.txt ZefaniaXMLMyBible arasvd_zef.xml
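
If you run these two steps often, a small wrapper can keep them together. The following is only a sketch: the helper names `build_commands` and `convert` are made up for illustration, and the jar is assumed to sit in the current directory. The arguments themselves are exactly the ones from the two commands above.

```python
import subprocess

JAR = "BibleMultiConverter-AllInOneEdition.jar"  # assumed to be in the working directory


def build_commands(sword_module_dir, out_xml, tmp="tmp.txt",
                   strip=("StripFootnotes", "StripHeadlines", "StripFormatting")):
    """Return the two argument lists: SWORD -> StrippedDiffable, then Diffable -> Zefania."""
    step1 = ["java", "-jar", JAR, "SWORD", sword_module_dir,
             "StrippedDiffable", tmp, *strip]
    step2 = ["java", "-jar", JAR, "Diffable", tmp,
             "ZefaniaXMLMyBible", out_xml]
    return [step1, step2]


def convert(sword_module_dir, out_xml):
    """Run both steps in order; check=True aborts if the first step fails."""
    for cmd in build_commands(sword_module_dir, out_xml):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only print the commands here; call convert(...) to actually run them.
    for cmd in build_commands(r"modules\texts\ztext\arasvd", "arasvd_zef.xml"):
        print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with the backslashes in the module path.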

bsname in Zefania XML is supposed to be (copied from the specification):

This attribute holds the book book name in short form, e.g. "Gen", in the language of the Bible book.

So, while bsname in an English Bible may be "Acts", it will be "Apg" in a German Bible. The way to detect that a book is actually the Book of Acts is via the bnumber attribute (which is 44).

However, you can use the attached XSL to rewrite the bsname based on bnumber if you prefer.

bsname-osis.xsl.zip
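
For readers who would rather not use XSLT, the same idea (derive bsname from bnumber) can be sketched in Python. This is not the attached stylesheet, whose contents are not reproduced here. The function name `rewrite_bsname` is hypothetical, the `XMLBIBLE` root in the sample is an assumption about the surrounding file, and the lookup table is deliberately partial: it covers only a few books and would need all the bnumber values your Bible uses.

```python
import xml.etree.ElementTree as ET

# Partial, illustrative table: Zefania bnumber -> OSIS book ID.
OSIS_BY_BNUMBER = {
    1: "Gen", 2: "Exod", 3: "Lev", 4: "Num", 5: "Deut",
    40: "Matt", 41: "Mark", 42: "Luke", 43: "John", 44: "Acts",
}


def rewrite_bsname(root):
    """Overwrite bsname on every BIBLEBOOK whose bnumber is in the table."""
    for book in root.iter("BIBLEBOOK"):
        osis = OSIS_BY_BNUMBER.get(int(book.get("bnumber", "0")))
        if osis is not None:  # books missing from the table are left untouched
            book.set("bsname", osis)
    return root


sample = ET.fromstring(
    "<XMLBIBLE>"
    '<BIBLEBOOK bname="Genesis" bnumber="1" bsname="Genesis"/>'
    '<BIBLEBOOK bname="Apostelgeschichte" bnumber="44" bsname="Apg"/>'
    "</XMLBIBLE>"
)
rewrite_bsname(sample)
print([b.get("bsname") for b in sample.iter("BIBLEBOOK")])  # ['Gen', 'Acts']
```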

I'll leave this issue open - feel free to close it if it answers your questions :)

j2l commented 7 years ago

I tried, and I still get the NOTE in the way. It's not present in tmp.txt, so I guess it's added by the converter, and only on the first verse of each chapter. The other ZefaniaXML export still writes bsname="Genesis" too.

Thank you so much for the xsl. Very useful indeed!

schierlm commented 7 years ago

Ah, sorry, I did not notice you used the ZefaniaXMLMyBible format. If you do not want to import into MyBible (www.mybible.de), I would not recommend this format, as it contains several quirks to work around bugs in MyBible:

So you'd probably have more work "fixing" these workarounds in your XSL later, than just starting with Zefania XML and implementing the workarounds yourself.

In case your only reason to use this format was to get OSIS-like IDs in the bsname, I guess it is easier to use my XSL instead :)
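
If a file has already been exported with the stray NOTE wrapper on the first verse of each chapter, one way to clean it up afterwards is a small post-processing pass. The sketch below is an assumption-laden illustration, not part of the converter: `strip_notes` and `_drop` are hypothetical helper names, and the sample is a compacted copy of the Genesis 1:1 fragment from the first comment. It deletes every NOTE inside a verse, then removes DIV wrappers that are left empty, while keeping the verse text that follows them.

```python
import xml.etree.ElementTree as ET

VERSE_TEXT = "فِي الْبَدْءِ خَلَقَ اللهُ السَّمَاوَاتِ وَالأَرْضَ."


def _drop(parent, child):
    """Remove child from parent, keeping the text that follows it (its tail)."""
    tail = child.tail or ""
    idx = list(parent).index(child)
    if idx == 0:
        parent.text = (parent.text or "") + tail
    else:
        prev = parent[idx - 1]
        prev.tail = (prev.tail or "") + tail
    parent.remove(child)


def strip_notes(vers):
    """Delete all NOTE elements under a VERS, then any DIV left empty."""
    for parent in list(vers.iter()):
        for child in list(parent):
            if child.tag == "NOTE":
                _drop(parent, child)
    changed = True
    while changed:  # repeat, since removing one DIV can empty its parent DIV
        changed = False
        for parent in list(vers.iter()):
            for child in list(parent):
                empty = len(child) == 0 and not (child.text or "").strip()
                if child.tag == "DIV" and empty:
                    _drop(parent, child)
                    changed = True
    return vers


vers = ET.fromstring(
    '<VERS vnumber="1"><DIV><NOTE type="x-studynote">&lt;p&gt;<DIV>'
    '<NOTE> <BR art="x-nl"/></NOTE></DIV>&lt;/p&gt;</NOTE></DIV>'
    + VERSE_TEXT + "</VERS>"
)
strip_notes(vers)
print(ET.tostring(vers, encoding="unicode"))
```

To apply it to a whole file, you would parse the file and call `strip_notes` on each element returned by `root.iter("VERS")`.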