Closed DavidHaslam closed 6 years ago
The same USFM files contain numerous chapter label tags \cl CAPITULO nn.
Another critical bug is that when these are converted to XML milestone elements, the element is not terminated with />.
<chapter sID="Gen.1" osisID="Gen.1" n="1" />
<!-- cl --><milestone type="x-chapterLabel" n="CAPITULO 1
<verse sID="Gen.1.1" osisID="Gen.1.1" n="1" />En el principio creó Dios los cielos y la tierra.
This is also a new bug. Earlier versions of u2o.py used to handle this tag correctly.
The title bug also affects the descriptive Psalm titles after processing the \d tags.
<chapter sID="Ps.3" osisID="Ps.3" n="3" />
<!-- cl --><milestone type="x-chapterLabel" n="SALMO 3" />
<!-- d --><title type="psalm" canonical="true">Salmo de David, cuando huía de delante de Absalom su hijo.
Without seeing the usfm source I am not going to be able to do anything to fix this.
Both bugs are already present in release 0.6 which I just tried as a cross-check, to make sure that it's not due to one of your subsequent commits.
Unfortunately, I didn't retain a copy of my download from November 2017.
The attached Zip file contains my set of USFM files for your debugging.
I myself generated these files from the text files supplied by my contact.
They have not been checked with any Bible translation editing software, but they are fairly simple in structure. They contain no non-standard markers.
FIO: USFM tag statistics for the concatenated data.
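For anyone wanting to reproduce this kind of tag statistics, here is a minimal sketch. The function name `usfm_tag_counts` and the sample text are illustrative assumptions, not the tool actually used for the figures in this thread, and the simple pattern ignores nested `\+` markers.

```python
import re
from collections import Counter

# A USFM marker: backslash, lowercase letters, optional digits,
# and an optional trailing * for closing character markers.
MARKER_RE = re.compile(r"\\([a-z]+\d*\*?)")

def usfm_tag_counts(text: str) -> Counter:
    """Count how often each USFM marker occurs in the given text."""
    return Counter(MARKER_RE.findall(text))

# Hypothetical sample in the verse-per-line style described in this thread.
sample = "\\c 1\n\\cl CAPITULO 1\n\\v 1 En el principio...\n\\v 2 ...\n"
counts = usfm_tag_counts(sample)
for marker, n in sorted(counts.items()):
    print(f"\\{marker}\t{n}")
```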
Please let me know if you require any further information.
This happened because the reflow routine in u2o doesn't handle text without paragraph/poetry markers correctly. It may take me a while to fix this.
Understood. Essentially this is a Verse Per Line Bible version.
i.e. I plan to include Feature=NoParagraphs in the SWORD module configuration.
The only places that use \p in this translation are the colophons at the end of each of the 14 Pauline Epistles.
As an interim workaround, I could insert \p immediately before each \v 1 and confirm whether that suppresses these errors.
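A rough sketch of that interim workaround, assuming each verse starts on its own line (as in a Verse Per Line text). The function name `add_paragraph_before_verse_one` is hypothetical; this is a temporary preprocessing step on a copy of the USFM, not part of u2o.py, and it does not handle verse ranges such as \v 1-2.

```python
import re

# Match "\v 1" at the start of a line, followed by whitespace,
# so that \v 10, \v 11, ... are left alone.
VERSE_ONE_RE = re.compile(r"^(\\v 1\s)", re.MULTILINE)

def add_paragraph_before_verse_one(usfm: str) -> str:
    """Insert a \\p line immediately before every \\v 1 line."""
    # A function replacement avoids backslash-escape handling in the
    # replacement string.
    return VERSE_ONE_RE.sub(lambda m: "\\p\n" + m.group(1), usfm)

# Hypothetical sample input.
sample = "\\c 1\n\\cl CAPITULO 1\n\\v 1 En el principio\n\\v 10 x\n"
patched = add_paragraph_before_verse_one(sample)
print(patched)
```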
Please don't use hacks to work around bugs in u2o. Let me fix it. Further testing showed it's more than just a bug in the handling of texts without paragraph/poetry markers.
Adding \p as proposed only solves the issues with the \d and \cl tags.
It didn't fix the issue with \qa tags, albeit the first one was correct.
I guess that what's needed as a workaround is to mark each verse in Psalm 119 as poetry. Or at least, the first verse in each of the 22 stanzas.
Thanks for further advice. It was only a temporary hack in a conversion script (actually a bespoke TextPipe filter). Easy to revert.
FIO: Marking each stanza in Psalm 119 as either poetry or paragraph was indeed a successful workaround.
Trying this was also useful for confirming that there were no further XML syntax errors or OSIS validation failures elsewhere in the file. That's good for peace of mind and planning next steps.
i.e. It gave me confidence that once you've succeeded in fixing this, then the task of module building should be fairly straightforward.
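The XML syntax check mentioned above can be approximated with a few lines of standard-library Python. This is only a sketch: it tests well-formedness, not validity against the OSIS schema, and the function name `well_formedness_error` and the wrapper `<div>` in the samples are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

def well_formedness_error(xml_text: str):
    """Return the parser's error message, or None if the XML is well-formed."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return str(err)
    return None

# Correctly terminated milestone, as in the Psalm 3 output quoted earlier.
good = ('<div><chapter sID="Ps.3" osisID="Ps.3" n="3" />'
        '<milestone type="x-chapterLabel" n="SALMO 3" /></div>')

# Unterminated milestone, as in the Genesis 1 output quoted earlier.
bad = ('<div><milestone type="x-chapterLabel" n="CAPITULO 1\n'
       '<verse sID="Gen.1.1" osisID="Gen.1.1" n="1" /></div>')

print(well_formedness_error(good))
print(well_formedness_error(bad))
```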
Ok, should be fixed now.
Thanks, Ryan.
Downloaded it and retested it with the unhacked USFM files. The OSIS file now validates.
Best regards,
David
Using the latest u2o.py downloaded yesterday, I just encountered a new critical bug.
Psalm 119 acrostic tags \qa were not properly closed in the XML by </title>. Likewise for each stanza.
Psalm 120 ended up being badly corrupted with </title> inserted multiple times within each <chapter sID ... /> and <verse sID ... /> element.
Afterwards, Psalm 121 continues OK.