byztxt / byzantine-majority-text

Byzantine Majority Greek New Testament text edited by Robinson and Pierpont, with morphological parsing tags and Strong's numbers
The Unlicense

Betacode Conversion #1

Closed jcuenod closed 3 years ago

jcuenod commented 3 years ago

Is there a Unicode conversion tool that works with this betacode? The usual packages I reach for stumble on some of the characters (like ^).

Thanks

ReneNyffenegger commented 3 years ago

James, I have written my own PowerShell betacode decoder module for the very purpose of parsing this betacode. You can find it via my home page ( https://renenyffenegger.ch/notes/Windows/PowerShell/modules/personal/betacode/index ) or directly at https://github.com/ReneNyffenegger/ps-modules-betacode. You might find it useful for your purposes.

jcuenod commented 3 years ago

Hmm, thanks! My solution was to do the minimal conversions required to standardise the betacode so that beta-code-js can parse it:

/* Adds an asterisk in front of words containing a capital letter (doesn't work on uppercase letters in the middle of a word) */
const injectAsterisk = str => str.replace(/\ ([^ {]*[A-Z])/g, " *$1")
/* Convert circumflex from ^ to = */
const replaceCircumflex = str => str.replace(/\^/g, "=")

const fixBetaCode = str => replaceCircumflex(injectAsterisk(str))

I haven't pushed it yet, but it'll end up here eventually: https://github.com/jcuenod/parabible-data-pipeline/tree/master/gk-byz-pipe
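
Note that the injectAsterisk helper above only matches a word that is preceded by a space, so a capitalised word at the very start of a string is missed. Below is a minimal sketch of one way to cover that case. It assumes that a single leading asterisk per word is enough, and the sample string is made up for illustration rather than taken from the CCT files.

```javascript
// Illustrative variant of the helpers above, not the code that ended up in the pipeline.

// Prefix a whitespace-delimited word with "*" when it contains an uppercase letter.
// (^|\s) also matches at the very start of the string; excluding "{" from the
// character class means tokens that begin with a brace are left untouched.
const injectAsterisk = str =>
  str.replace(/(^|\s)([^\s{]*[A-Z])/g, "$1*$2")

// Convert the circumflex marker from ^ to =
const replaceCircumflex = str => str.replace(/\^/g, "=")

const fixBetaCode = str => replaceCircumflex(injectAsterisk(str))

// Made-up sample input, for illustration only
console.log(fixBetaCode("Ou^tov hn Ihsouv")) // prints "*Ou=tov hn *Ihsouv"
```

The normalised string can then be handed to whichever betacode parser you use. Like the original, this still places the asterisk at the start of the word rather than directly before a capital that appears mid-word.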

emg commented 3 years ago

Hi @jcuenod, @ReneNyffenegger and @normansimonr,

Thanks to the work of @normansimonr, a Unicode version has been created from the Beta code CCT files. Check out his work:

https://github.com/byztxt/byzantine-majority-text/tree/master/csv-unicode

Best wishes,

Ulrik Sandborg-Petersen

jcuenod commented 3 years ago

Thanks @emg!