openenglishbible / Open-English-Bible

A CC0 licenced modern English translation of the Bible
http://openenglishbible.org

Jeremiah: fix a lot of typos, prevent words breaking over lines #327

Closed: countingpine closed this 3 years ago

countingpine commented 3 years ago

I accomplished most of this by generating a word list and checking out any wrong or suspicious words. I found and fixed a couple more things in the process, e.g. \p v 1 near the start.
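For anyone wanting to repeat the word-list check, here is a minimal sketch of one way it could be done. This is not the script used for this PR; the function names `word_counts` and `rare_words` are hypothetical, and it assumes the source is USFM-style text where markers start with a backslash.

```python
import re
from collections import Counter


def word_counts(usfm_text):
    """Count words in USFM-style text, ignoring backslash markers."""
    # Assumption: markers look like a backslash, letters, optional digits,
    # and an optional closing asterisk. Replace them with a space.
    without_markers = re.sub(r"\\[a-z]+\d*\*?", " ", usfm_text)
    # Words are runs of letters, optionally joined by an apostrophe.
    words = re.findall(r"[A-Za-z]+(?:['’][A-Za-z]+)*", without_markers)
    return Counter(w.lower() for w in words)


def rare_words(counts, max_count=1):
    """Return words seen at most max_count times, sorted alphabetically."""
    return sorted(w for w, n in counts.items() if n <= max_count)
```

Words that occur only once or twice are a reasonable place to start looking for typos, though plenty of them will be legitimate names, so the list still needs checking by eye against the reference scan.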

I also removed all the hyphenated linebreaks. (In some cases this required some guesswork as to whether to keep the hyphen.) There are now no hyphens at the end of lines (but plenty of em dashes).
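A check for remaining end-of-line hyphens could look something like the sketch below. The function name `find_hyphen_breaks` is hypothetical, and this is only an illustration, not the method used here. It deliberately reports the breaks rather than joining them, since deciding whether to keep the hyphen is a judgment call.

```python
def find_hyphen_breaks(text):
    """Return (line_number, fragment, continuation) for lines ending in a hyphen."""
    lines = text.splitlines()
    hits = []
    # The last line has no following line to continue onto, so skip it.
    for i, line in enumerate(lines[:-1]):
        stripped = line.rstrip()
        # Only a plain hyphen counts; an em dash is a different character.
        if stripped.endswith("-"):
            fragment = stripped.split()[-1]
            next_tokens = lines[i + 1].split()
            # Assumption: the continuation is the first token on the next
            # line. A marker or verse number there would need handling by hand.
            continuation = next_tokens[0] if next_tokens else ""
            hits.append((i + 1, fragment, continuation))
    return hits
```

An empty result means no line ends in a hyphen. Each hit gives the 1-based line number plus the two halves of the word, which is enough to look it up in the reference text.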

I used the McFadyen scan from https://archive.org/details/jeremiahinmodern00mcfauoft as a reference, mostly referring to the OCR text, occasionally checking the PDF also.

(Most of the mistakes look like keyboard errors to me, suggesting the whole book was typed up at some point rather than cleaned up from an OCR version.)

I notice that in many cases the verse numbers don't appear in the right place on the line (indeed, in some places they came in the middle of hyphenated words). Or perhaps for poetic verses it could be said that it's the linebreaks that are in the wrong place.

Either way, fixing that is probably a massive, separate task.

openenglishbible commented 3 years ago

Hi countingpine, this is great, thanks. Before I merge, can you either confirm here, or send me an email to oebible@openenglishbible.org confirming you're happy for all your contributions to the OEB to be released under the CC0 licence? This is just so I have a record when we later start dealing with commercial publishers.

countingpine commented 3 years ago

Absolutely - I'm happy for my work to be licensed under CC0, and it's one of the reasons why I think the project is worth supporting.

(And particularly in this case, where there's barely anything in my changes for which I could claim any kind of authorship.)

openenglishbible commented 3 years ago

Thanks countingpine