lukeme / gobible

Automatically exported from code.google.com/p/gobible

Go Bible Creator removes glossary words tagged in USFM by \w_...\w* #161

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
Go Bible Creator removes glossary words tagged in USFM by \w_...\w*

The cause is systematic: it lies in how this particular USFM tag pair is parsed.

    \w_...\w*
    · Wordlist/glossary/dictionary entry.
    · Surround word(s) with this markup to indicate that it appears (or should appear) in the word list.

The problem is that the "\w" tag pair is placed in the wrong group for 
processing. The partial description below is from GBC_2.4_Readme.txt (attached 
for information).

    4. Removal of the USFM tags which have a start and end tag but we want to keep the text inside of the markers (i.e., formatting type tags for bold, etc):
       "\qs", "\qac", "\add", "\dc",
       "\nd", "\ord", "\pn", "\qt", "\sig", "\sls",
       "\tl", "\em", "\bd", "\it", "\bdit", "\no", "\sc", "\k"
    5. Removal of the USFM tags which have a start and end tag and all the text that lies between (i.e., comment type tags):
       "\ca", "\va", "\vp", "\fe", "\bk",
       "\xdc", "\fdc", "\fm", "\fig", "\ndx", "\pro",
       "\wg", "\wh", "\f", "\w", "\x", "\rq", "\xot", "\xnt", "\iqt"

The "\w" tag should be in group 4, not group 5.
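The difference between the two groups can be sketched as follows. This is only an illustration in Python, not the Go Bible Creator source: the function names `strip_keep_text`, `strip_with_text` and `clean` are hypothetical, and each call passes only the tags it needs rather than the full lists from the Readme.

```python
import re

def strip_keep_text(text, tags):
    """Group 4 behaviour: delete the start and end markers, keep the text between."""
    for tag in tags:
        # The \s after the tag name stops "\w" from matching "\wg" or "\wh".
        pattern = re.compile(
            r"\\" + re.escape(tag) + r"\s(.*?)\\" + re.escape(tag) + r"\*",
            re.DOTALL,
        )
        text = pattern.sub(r"\1", text)
    return text

def strip_with_text(text, tags):
    """Group 5 behaviour: delete the markers and everything between them."""
    for tag in tags:
        pattern = re.compile(
            r"\\" + re.escape(tag) + r"\s.*?\\" + re.escape(tag) + r"\*",
            re.DOTALL,
        )
        text = pattern.sub("", text)
    return text

def clean(text, keep_tags, drop_tags):
    """Apply group 5 removal first, then group 4 removal."""
    return strip_keep_text(strip_with_text(text, drop_tags), keep_tags)

verse = r"In the \w beginning\w* God created"

# With "\w" in group 5 (the reported behaviour) the glossary word is lost.
print(clean(verse, keep_tags=["add", "nd"], drop_tags=["f", "x", "w"]))
# → In the  God created

# With "\w" in group 4 (the proposed fix) only the markers are removed.
print(clean(verse, keep_tags=["add", "nd", "w"], drop_tags=["f", "x"]))
# → In the beginning God created
```

Note the doubled space left behind in the first output: the symptom is silent, which may explain why it went unnoticed.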

This is very worrying, as it potentially affects several Go Bible apps we 
have made from USFM source text since April 2010.

Symptoms reported by The Translation Trust. See email for further details. 

NB. I have cc'd this to Dan Hinton for information only; in 2010 he was the 
programmer for GBC v2.4. We can all make mistakes, and this one may go back 
even earlier.

Original issue reported on code.google.com by DFH...@gmail.com on 26 Nov 2011 at 8:43

GoogleCodeExporter commented 8 years ago
Daniel has fixed the bug and checked in the update to the SVN repo.

David has tested the fix and confirmed that it solves the issue.

David will edit the reference documentation, etc.

PS.  The Translation Trust has also been informed.

Original comment by DFH...@gmail.com on 28 Nov 2011 at 8:42

GoogleCodeExporter commented 8 years ago
Not many Go Bible applications would have been affected by the bug for two 
reasons:

(a) Not all Go Bibles are made from USFM source text
(b) The use of \w_...\w* is relatively rare, used only when the translation is 
accompanied by a glossary.
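To decide whether a particular source text is affected, one could simply count the \w_...\w* pairs in its USFM files. The sketch below assumes the USFM text has already been read into a string; the function name `count_glossary_words` is hypothetical and is not part of Go Bible Creator.

```python
import re

# Matches a "\w ...\w*" pair; the \s after "\w" excludes "\wg" and "\wh".
GLOSSARY_PAIR = re.compile(r"\\w\s.*?\\w\*", re.DOTALL)

def count_glossary_words(usfm_text):
    """Return how many \\w ...\\w* glossary entries appear in the text."""
    return len(GLOSSARY_PAIR.findall(usfm_text))

sample = r"\v 1 \w grace\w* and \w peace\w* to you, \wg charis\wg*"
print(count_glossary_words(sample))  # → 2
```

A count of zero for every book would suggest that a Go Bible built from that text does not need rebuilding on account of this bug.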

One that does spring to mind is the Ndebele Go Bible that I made for Teus 
Benschop.
I will check this and advise accordingly.

btw. This comment should be seen as a Containment Action for those familiar 
with 8D Problem Solving.  See 
http://en.wikipedia.org/wiki/Eight_Disciplines_Problem_Solving

Original comment by DFH...@gmail.com on 28 Nov 2011 at 8:49

GoogleCodeExporter commented 8 years ago
I've added Erik to the cc: list to be sure that our friends in SIL Dallas are 
made aware of this.

I will also send a memo to Jim Allbright in relation to the use of Go Bible 
software in the SIL Pathway project.  See http://code.google.com/p/pathway/

Original comment by DFH...@gmail.com on 28 Nov 2011 at 8:55

GoogleCodeExporter commented 8 years ago
Rebuilt the Ndebele Go Bible.  Will upload to box.net presently.

Original comment by DFH...@gmail.com on 28 Nov 2011 at 10:14

GoogleCodeExporter commented 8 years ago
Updated Ndebele Go Bible files now available in 
http://www.box.com/shared/ta29dr148f

Original comment by DFH...@gmail.com on 28 Nov 2011 at 10:18

GoogleCodeExporter commented 8 years ago
Thanks for updating the Ndebele Bible. I've made it available through
bibleconsultants.dyndns.org. Teus.

Original comment by teusjann...@gmail.com on 28 Nov 2011 at 7:10

GoogleCodeExporter commented 8 years ago
These special-feature tags should be handled in the same way as the "\w" fix 
for this issue.

"\ndx", "\wg", "\wh"

i.e. Move from group 5 to group 4, such that the tags are deleted but the text 
between them is retained.
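A quick way to check the intended result for all four tag pairs is a single pattern with a backreference, so that each end marker must match its own start marker. This is a sketch of the expected behaviour only, not the code in SVN; `strip_wordlist_markers` is a hypothetical name.

```python
import re

# Longer tag names come first so that "\wg" and "\wh" are not read as "\w".
WORDLIST_PAIR = re.compile(r"\\(ndx|wg|wh|w)\s(.*?)\\\1\*", re.DOTALL)

def strip_wordlist_markers(text):
    """Delete \\ndx, \\wg, \\wh and \\w markers but keep the text inside them."""
    return WORDLIST_PAIR.sub(r"\2", text)

sample = r"\wg grace\wg* and \wh shalom\wh* in \ndx Zion\ndx*, \w faith\w*"
print(strip_wordlist_markers(sample))
# → grace and shalom in Zion, faith
```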

Original comment by DFH...@gmail.com on 6 Dec 2011 at 3:10

GoogleCodeExporter commented 8 years ago
Those further tag pairs have been fixed in version 2.4.3, 
committed to SVN on 8 Dec 2011.

David to verify this with suitable USFM files.

Original comment by DFH...@gmail.com on 11 Dec 2011 at 6:40