lukeme / gobible

Automatically exported from code.google.com/p/gobible

GoBibleCreator omits the first word of a verse if the space is missing after the USFM tag \v #77

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
Example:

USFM file (Shona) has in Exodus \c 36

\v 31Akaitawo mbariro dzomuakasia, shanu dzamapuranga orumwe rutivi
rwetabhenakeri,

The space is missing between the verse number and the verse text!

Go Bible ends up with 

31mbariro dzomuakasia, shanu dzamapuranga orumwe rutivi rwetabhenakeri,

Original issue reported on code.google.com by DFH...@gmail.com on 9 Nov 2009 at 3:13

GoogleCodeExporter commented 8 years ago
The real issue for GBC is that there is no error message in the log file.
Thus the JAR file might be thought to have been made with no errors, even
though it has words missing. Such errors can remain undiscovered for a long time.
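
As a rough illustration of the kind of check that could produce such a log message, here is a minimal Python sketch. It is not part of Go Bible Creator; the function name `find_missing_spaces` is hypothetical, and the pattern assumes verse numbers are plain numbers or simple ranges such as `1-2`.

```python
import re

# Matches a \v marker whose verse number (or simple range such as 1-2)
# is followed directly by a non-whitespace character.
MISSING_SPACE = re.compile(r"\\v\s+(\d+(?:-\d+)?)(?=[^\s\d-])")


def find_missing_spaces(usfm_text):
    """Return (line_number, verse_number) pairs for suspect verse markers."""
    problems = []
    for line_number, line in enumerate(usfm_text.splitlines(), start=1):
        for match in MISSING_SPACE.finditer(line):
            problems.append((line_number, match.group(1)))
    return problems
```

Each returned pair could be written to the log as a warning, so a build with dropped words would no longer look clean.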

The workaround is for the translator to correct the USFM file and insert the
missing space.

Original comment by DFH...@gmail.com on 9 Nov 2009 at 3:15

GoogleCodeExporter commented 8 years ago
Teus has also added this as a bug in Bibledit. See
https://savannah.nongnu.org/bugs/?27985

NB. Register and log in if you can't read this link.

Original comment by DFH...@gmail.com on 8 Dec 2009 at 4:13

GoogleCodeExporter commented 8 years ago
This probably should be a different issue, but it's related, so I'll note it
here, and DFHMCH can make a new issue if it should be.

Looking at the code, the actual verse numbers that were in the data are not
used for display on the device. If I had a KANNADA DIGIT ONE (U+0CE7) in my
data, I would still get the ASCII DIGIT ONE (U+0031) on my display. The same
holds for all the other Unicode digits: ARABIC-INDIC, EXTENDED ARABIC-INDIC,
TELUGU, DEVANAGARI, BENGALI, GURMUKHI, ... MYANMAR, ETHIOPIC, ... and many more.

It would be nice for the display code to use the same digits as were used in
the data file.

This would require a little reworking of the verse processing (not searching
for spaces, keeping the digits, etc.), but it shouldn't be too difficult.
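
To make the idea concrete, here is a minimal Python sketch (Go Bible Creator itself is Java, and this is not its code). The name `split_verse` is hypothetical. It takes the run of digits after `\v` as the verse number instead of searching for a space, and keeps the digits exactly as they were encoded.

```python
import re

# In Python 3, \d matches any Unicode decimal digit, not only ASCII 0-9.
VERSE = re.compile(r"\\v\s+(\d+)\s*(.*)", re.DOTALL)


def split_verse(line):
    """Split a verse line into (original_digits, numeric_value, verse_text).

    Returns None if the line does not start with a \\v marker and a number.
    """
    match = VERSE.match(line)
    if match is None:
        return None
    digits, text = match.groups()
    # int() also accepts non-ASCII decimal digits, so the value is usable
    # for indexing while the original digits stay available for display.
    return digits, int(digits), text
```

Because the number ends where the digits end, a missing space no longer swallows the first word. One caveat: this only covers numerals that Unicode classes as decimal digits; some scripts in the list above (Ethiopic, as far as I know) are not classed that way and would need separate handling.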

Original comment by dhinton...@gmail.com on 8 Dec 2009 at 10:58

GoogleCodeExporter commented 8 years ago
If the code were changed to maintain the encoded digits, the missing space
after the verse number would not be a problem. I wouldn't call it only a fix,
but it would fix this problem, which is why I listed it here. (Sorry for the
confusion.)

Original comment by dhinton...@gmail.com on 8 Dec 2009 at 11:00

GoogleCodeExporter commented 8 years ago
In USFM files, the space should simply not be missing after the verse number!
We can ask the Bibledit programmers for a way to check for such errors.

The real issue for Go Bible Creator is that it fails to include the first word,
and also fails to report an error message. So you are lulled into a false sense
of success, when there has been a serious error in reproducing the Biblical text.

There is already a separate issue about using a different set of Unicode digits.

See http://code.google.com/p/gobible/issues/detail?id=24

I think we should take that issue forward sometime, especially as it would be
much easier to test. For example, if the Unicode digits to be used were defined
by an optional item in the collections text file, one could build even the
English KJV with Devanagari digits! This means you don't need to have (say) a
Hindi Bible in USFM format in order to test the feature.
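
A minimal Python sketch of how such an option might work, assuming the collections file supplied the zero digit of the wanted script. Both the option and the function name `localize_number` are hypothetical; nothing like this exists in Go Bible Creator as far as this thread shows.

```python
def localize_number(number, zero_digit="0"):
    """Render a non-negative integer using the digit set that starts at zero_digit.

    Assumes the ten digits of the target script are contiguous code points,
    which holds for Unicode decimal digit sets such as Devanagari.
    """
    offset = ord(zero_digit) - ord("0")
    return "".join(chr(ord(ch) + offset) for ch in str(number))
```

For example, `localize_number(31, "\u0966")` gives the Devanagari form of verse 31, since U+0966 is DEVANAGARI DIGIT ZERO.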

Original comment by DFH...@gmail.com on 9 Dec 2009 at 1:53

GoogleCodeExporter commented 8 years ago
Priority changed. We will not fix this in the version 2.4.0 release.

Original comment by DFH...@gmail.com on 27 Mar 2010 at 3:49

GoogleCodeExporter commented 8 years ago
Won't fix this, as the USFM files should be checked and validated before using 
Go Bible Creator, and if necessary, fixed by preprocessing methods.
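
One possible preprocessing step, sketched in Python under the assumption that verse numbers are plain numbers or simple ranges such as `1-2`. The function name `insert_missing_spaces` is hypothetical, and the output should be reviewed before it replaces the translator's file.

```python
import re

# Captures a \v marker plus its verse number when the number is followed
# directly by a non-whitespace character.
MISSING_SPACE = re.compile(r"(\\v\s+\d+(?:-\d+)?)(?=[^\s\d-])")


def insert_missing_spaces(usfm_text):
    """Return usfm_text with a space added after any verse number lacking one."""
    return MISSING_SPACE.sub(r"\1 ", usfm_text)
```

Lines that already have the space are left unchanged, so the step is safe to run repeatedly.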

Original comment by DFH...@gmail.com on 31 Dec 2012 at 2:36