ubsicap / usx

Unified Scripture XML

Available parsers? #39

Open fjellandermedia opened 4 years ago

fjellandermedia commented 4 years ago

Excuse me if this is the wrong forum for this issue. If so, please close this thread (but I would in that case much appreciate a referral!)

I’m an independent developer in Sweden working a lot with different mobile and web apps for various church needs. I have encountered the USX format a few times already (developing apps for church needs often involves text from the Bible, who knew?!) and my own parsers of the USX data have been mediocre at best. To be fair, I’m in no way an XML expert and my understanding of how to work with it is unfortunately limited.

So my question is: are there any resources available (parsers, XSLT files or something) to convert the USX format to HTML (preferably) or even plain text? I’ve seen Haiola, but it’s not entirely clear if that works flawlessly with USX since it’s built for USFX, and I would also prefer something I can use on the command line (and I think Haiola is a GUI app?). I’ve been duckduckgoing and googling all night but I haven’t found anything else. Any help would be greatly appreciated!

jonbitgood commented 4 years ago

It's about as basic as it gets, but I've made a usx.xslt that's generally compatible with USX 3.0. It looks nice with the CSS from API.bible.

It'd be nice to get a more functional version put together and pushed here.

fjellandermedia commented 4 years ago

Wow, this is great, Jon! Thanks a lot! This goes a long way! I agree that it would be great if there could be an “official” and more functional version in this repo. Maybe even a couple of them, so you could choose whether you’d like verse numbers, comments inline, etc. I think a lot of implementations of Bible text in one way or another include showing it as HTML.

jonbitgood commented 4 years ago

I think you could do that with just a single source file and alter the bible.css? That way users could choose which variations they like without the loss of data. This open source bible reader operates in that way.

Anyway, I'll work on learning XSLT and XPath to make a more functional version of the usx.xslt. Once it's more fleshed out, I'll submit a pull request.

klassenjm commented 4 years ago

@jonBitgood Thanks for replying. I have also inquired with some colleagues familiar with processing USX (including API.Bible). @fjellandermedia There is nothing else I can readily add to the repo here right now. I agree that it will be good to do so, and will welcome your PR.

shadow-light commented 2 years ago

Hi, just checking whether anyone has made any progress on this?

We also need a USX -> HTML converter (and will probably go ahead and build one if needed). It looks like the bulk of the work has been done already by api.bible with their open source styles: https://github.com/americanbible/scripture-styles/blob/master/scss/modules/_paragraphs.scss

So it looks like it's just a matter of converting XML tags to HTML approximations and adding classes that match the USX (which the stylesheets also use).
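A minimal sketch of that idea in Python, using only the standard library. It is not api.bible's code, and the function names (`usx_to_html`, `_inline`) are made up for illustration. It assumes USX 3.0 input, where `<verse>` and `<chapter>` are empty milestone elements and only the starting milestone carries a `number`, and it assumes the stylesheet's class names match the USX `style` attribute values, which should be checked against the scripture-styles source. Notes, figures and references are simply dropped.

```python
import xml.etree.ElementTree as ET
from html import escape


def _inline(el):
    """Render the text and inline children of a <para> or <char> element."""
    parts = [escape(el.text or "")]
    for child in el:
        style = escape(child.get("style", ""))
        if child.tag == "verse":
            # Verse start milestones carry a number; end milestones do not.
            if child.get("number"):
                number = escape(child.get("number"))
                parts.append(f'<span class="{style}">{number}</span>')
        elif child.tag == "char":
            parts.append(f'<span class="{style}">{_inline(child)}</span>')
        # Any other element (note, figure, ref, ...) is dropped in this sketch,
        # but the text that follows it still belongs to the paragraph.
        parts.append(escape(child.tail or ""))
    return "".join(parts)


def usx_to_html(usx_source):
    """Convert a USX document (as a string) into a flat HTML fragment."""
    root = ET.fromstring(usx_source)
    blocks = []
    for el in root:
        style = escape(el.get("style", ""))
        if el.tag == "chapter" and el.get("number"):
            number = escape(el.get("number"))
            blocks.append(f'<h2 class="{style}">{number}</h2>')
        elif el.tag == "para":
            blocks.append(f'<p class="{style}">{_inline(el)}</p>')
    return "\n".join(blocks)


# A tiny hand-written sample, not taken from any published translation file.
SAMPLE = """<usx version="3.0">
  <book code="GEN" style="id">Genesis</book>
  <chapter number="1" style="c" sid="GEN 1"/>
  <para style="p"><verse number="1" style="v" sid="GEN 1:1"/>In the beginning <char style="nd">God</char> created the heavens and the earth.<verse eid="GEN 1:1"/></para>
  <chapter eid="GEN 1"/>
</usx>"""

if __name__ == "__main__":
    print(usx_to_html(SAMPLE))
```

Because the output keeps the USX style codes as class names, the choice of showing or hiding verse numbers can stay in the CSS rather than in the converter.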

Is api.bible able to share their own code they used for doing the conversion?

jonbitgood commented 2 years ago

I've got a few inputs and outputs using XSLT to convert to HTML, EPUB, PDF, etc. They've got some domain-specific information in them, but I'd love to extract that and collab on them. I'll get in touch.

ethanbarry commented 11 months ago

I see it's been a while, but I have created a very rough parser that outputs LaTeX source. Maybe someone can use this? I needed it for a typesetting project, and wrote it over the course of a week or so...

danzuep commented 8 months ago

Thanks @jonbitgood for the XSLT file example! I used a step-by-step conversion process demonstrated by haiola/BibleFileLib as inspiration to make a more parser-friendly XML file from USX. Here's a link to the folder. I've also added a PowerShell script in there you can use for the conversion if you're not au fait with C# .NET. I've only tested it with the book of Genesis from a new Creative Commons licensed Bible from The Digital Bible Library, but I'm posting it here anyway as I'll update it if I find any issues.

jonbitgood commented 7 months ago

The XSLT file example has been expanded to account for various outputs - HTML, EPUB, SQL, PDF, etc. It got a little custom and convoluted, but maybe it'll be of help - all the code is open source and available here:

https://github.com/digitalbiblesociety/lamedh/tree/main/xslt