lukeme / gobible

Automatically exported from code.google.com/p/gobible

Display errors in KJV red letter edition - when made from USFM source text from ebible.org #135

Closed: GoogleCodeExporter closed this issue 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. Create a KJV Go Bible from the USFM source text downloaded from ebible.org.
2. View Matthew 5:13, which is one example of the display error.
3. There are many more instances: Mt 5:17, Mt 5:21, Mt 5:27, etc.

What is the expected output? What do you see instead?
Verse 13 should start after the end of verse 12.
Instead, the first line of verse 13 overlaps the last line of verse 12.

In the Sermon on the Mount, Matthew 5:3 to 7:27 should all be red letter.
A second display problem is that some of these verses have reverted to black.

Please use labels and text to provide additional information.
The Go Bible was made using version 2.4.0

I have attached the collections text file, which was used to make several 
collections. Programmers investigating the issue can easily use this to create 
the same Go Bible apps.

Original issue reported on code.google.com by DFH...@gmail.com on 10 Oct 2010 at 3:31

GoogleCodeExporter commented 8 years ago
Another aspect of the issue is that some verse numbers are displayed in red.
They should not be.

Original comment by DFH...@gmail.com on 10 Oct 2010 at 3:32

GoogleCodeExporter commented 8 years ago

Original comment by DFH...@gmail.com on 10 Oct 2010 at 3:33

GoogleCodeExporter commented 8 years ago
Analysis: 

The USFM source text files from ebible.org contain many superfluous 
instances of the Words of Jesus end marker \wj*, for example:

\v 12 \wj Rejoice, and be exceeding glad: for great \add is\add*\wj*\wj  your
reward in heaven: for so persecuted they the prophets which were before you.
\wj*
\p
\wj*
\v 13 \wj Ye are the salt of the earth: but if the salt have lost his savour,
wherewith shall it be salted? it is thenceforth good for nothing, but to be
cast out, and to be trodden under foot of men. \wj*

These are processed by Go Bible Creator in the normal way, causing the red 
letter attribute to toggle incorrectly.

It looks as if the files from ebible.org may not be strictly compliant with the 
USFM standard. One workaround might be to preprocess the files to remove the 
superfluous markers.

A solution might be for GoBibleCreator to ignore repeated occurrences of the 
start marker \wj and repeated occurrences of the end marker \wj*.
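
The idea of ignoring repeated markers can be sketched as a small state 
machine. This is a minimal Python sketch, not Go Bible Creator's actual code; 
the function name drop_repeated_wj and the regular expression are assumptions 
made for illustration. It keeps a start marker only when no red letter span is 
open, and an end marker only when one is open.

```python
import re

# Matches the end marker "\wj*", or the start marker "\wj" when it is
# followed by whitespace (so "\wj" never matches the front of "\wj*").
WJ_MARKER = re.compile(r"\\wj\*|\\wj(?=\s)")


def drop_repeated_wj(text):
    """Return text with repeated \\wj and repeated \\wj* markers removed.

    Hypothetical helper: a start marker is kept only when no span is
    open, and an end marker is kept only when a span is open.
    """
    inside = False   # True while a Words of Jesus span is open
    pieces = []
    pos = 0
    for match in WJ_MARKER.finditer(text):
        pieces.append(text[pos:match.start()])
        pos = match.end()
        is_end = match.group() == "\\wj*"
        if is_end and inside:
            pieces.append(match.group())
            inside = False
        elif not is_end and not inside:
            pieces.append(match.group())
            inside = True
        # Otherwise the marker repeats the current state and is skipped.
    pieces.append(text[pos:])
    return "".join(pieces)
```

A skipped marker that sat on a line of its own leaves an empty line behind, so 
a real preprocessing step would probably also want to tidy blank lines.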

Original comment by DFH...@gmail.com on 10 Oct 2010 at 4:21

GoogleCodeExporter commented 8 years ago
Counts for the pattern 
\wj*
\p
\wj*

40-MAT-kjv.ptx    86 occurrences
41-MRK-kjv.ptx    37 occurrences
42-LUK-kjv.ptx    74 occurrences
43-JHN-kjv.ptx    27 occurrences
44-ACT-kjv.ptx     1 occurrence
...
66-REV-kjv.ptx    10 occurrences

This pattern may be an artefact of the software that Kahanapule Michael Johnson 
has used to generate the USFM files from another format.
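
For anyone wanting to repeat counts like these, here is a minimal Python 
sketch. The function names, the UTF-8 encoding, and the tolerance for trailing 
spaces and Windows line endings are all assumptions; the counts above were not 
necessarily produced this way.

```python
import re
from pathlib import Path

# An end marker, a paragraph marker, then another end marker, each on its
# own line. Trailing spaces/tabs and CRLF line endings are tolerated.
PATTERN = re.compile(r"\\wj\*[ \t]*\r?\n\\p[ \t]*\r?\n\\wj\*")


def count_pattern(text):
    """Count non-overlapping occurrences of the three-line pattern."""
    return len(PATTERN.findall(text))


def count_in_files(folder):
    """Map each .ptx file name in folder to its pattern count.

    Assumes the files are UTF-8 encoded, which may not hold for every
    source text.
    """
    return {
        path.name: count_pattern(path.read_text(encoding="utf-8"))
        for path in sorted(Path(folder).glob("*.ptx"))
    }
```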

Original comment by DFH...@gmail.com on 10 Oct 2010 at 4:28

GoogleCodeExporter commented 8 years ago
Another redundant pattern is when the end marker is followed immediately by the 
start marker.  A search for "\wj*\wj " gives

40-MAT-kjv.ptx   241 matches

Also, the start marker \wj usually has two spaces after it, rather than a 
single space. There are way too many double spaces in these files.

Original comment by DFH...@gmail.com on 10 Oct 2010 at 4:50

GoogleCodeExporter commented 8 years ago
Further analysis (First Gospel for illustration):

40-MAT-kjv.ptx

"\wj "   987 matches
"\wj*"  1079 matches

Thus there are 92 unpaired instances of the end marker.
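
The same tally can be sketched in a few lines of Python. The function name is 
hypothetical, and the search strings are the two shown above. Note that 
searching for "\wj " misses any start marker that ends a line with no trailing 
space, so the surplus is only as reliable as that assumption.

```python
def wj_marker_counts(text):
    """Return (starts, ends, surplus_ends) for the Words of Jesus markers."""
    starts = text.count("\\wj ")   # start marker followed by a space
    ends = text.count("\\wj*")     # end marker
    return starts, ends, ends - starts
```

With the Matthew figures above this gives 1079 - 987 = 92 surplus end markers.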

Original comment by DFH...@gmail.com on 10 Oct 2010 at 8:07

GoogleCodeExporter commented 8 years ago
The Go Bible Creator issue is that it does not detect unpaired instances of 
USFM markers that should always be paired, and it generates no error message.
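
A check of this kind could look roughly like the following Python sketch. It 
is an illustration only, not Go Bible Creator code: the function name, the 
message wording, and the choice to handle only the \wj pair are assumptions.

```python
import re

# End marker "\wj*", or start marker "\wj" followed by whitespace or the
# end of the line.
WJ_MARKER = re.compile(r"\\wj\*|\\wj(?=\s|$)")


def find_unpaired_wj(text):
    """Return a list of (line_number, message) for unpaired \\wj markers."""
    problems = []
    open_line = None   # line number of the currently open start marker
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in WJ_MARKER.finditer(line):
            if match.group() == "\\wj*":
                if open_line is None:
                    problems.append(
                        (line_no, "end marker \\wj* without an open \\wj"))
                else:
                    open_line = None
            else:
                if open_line is not None:
                    problems.append(
                        (line_no,
                         f"start marker \\wj while line {open_line} is still open"))
                open_line = line_no
    if open_line is not None:
        problems.append((open_line, "start marker \\wj is never closed"))
    return problems
```

Run against an excerpt shaped like the Matthew 5:12-13 text quoted earlier in 
this thread, it flags the stray end marker that follows \p.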

Original comment by DFH...@gmail.com on 10 Oct 2010 at 8:08

GoogleCodeExporter commented 8 years ago
Workaround implemented. 

This morning, I developed a TextPipe Standard filter to remove the unpaired and 
redundant wj markers in the source text files. This has fixed the display 
problem.

As a follow up, I should email Michael Johnson to inform him of this issue.

Original comment by DFH...@gmail.com on 11 Oct 2010 at 9:58

GoogleCodeExporter commented 8 years ago
The "export to clipboard" TextPipe filter is in the attached file. (FIO)
The binary .fll file is available upon request.

Original comment by DFH...@gmail.com on 11 Oct 2010 at 11:29

GoogleCodeExporter commented 8 years ago
I have just emailed Michael Johnson to inform him of this issue.

Original comment by DFH...@gmail.com on 11 Oct 2010 at 11:40

GoogleCodeExporter commented 8 years ago
Closed this issue, which relates to one particular set of USFM files from ebible.org.

Added new issue 136. See
http://code.google.com/p/gobible/issues/detail?id=136

Original comment by DFH...@gmail.com on 11 Oct 2010 at 12:36