sillsdev / ptx2pdf

XeTeX based macro package for typesetting USFM formatted (Paratext output) scripture files

First paragraph gets sucked up into the main title when printing verse range #866

Open JimSmith763268 opened 1 year ago

JimSmith763268 commented 1 year ago

I have found that when I try to print a verse range, the first paragraph of text gets sucked up into the main title formatting.

I'm not sure how widespread this problem is; I encountered it when I was trying to print a passage starting at the beginning of a chapter. Here's an MWE:

\id EXO
\h header
\mt1 main title
\c 1
\cl Chapter 1
\s section head
\p
\v 1 verse one
\v 2 verse two.
\v 3 verse three
\p
\v 4 verse four

When I run the range EXO 1:1-4, this produces a title reading “main title 1verse one 2verse two. 3verse three”. After the paragraph break, the text seems to be formatted correctly. The section head also gets lost.

Version 2.3.9

davidg-sil commented 1 year ago

In my testing just now (2.3.12), what I'm seeing is the following internally consistent behaviour. Given an input of:

\id JHN - John 
\h JOHN
\toc1 John
\mt3 The Gospel of 
\mt1 John
\c 1
\s1 A title
\p
\v 1 In the beginning...
\v 2 etc

With a range of JHN 1-2, there is no 'corruption'. Nor is there if I say JHN 1:0-5. If, however, I say JHN 1:1-3, then everything between the \mt1 line and the \v 1 line (exclusive) is excluded.

This is not very intuitive, but it makes a certain amount of sense... the \c 1, \s1 and \p all fall within the category of 'Chapter 1, stuff before verse 1', which I guess it is calling verse 0.
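The 'verse 0' reading above can be illustrated with a small Python sketch. It is only an illustration of the bookkeeping and is not taken from the ptx2pdf source; the function name `tag_positions` is made up for this example, and it assumes one marker per line, as in the sample input.

```python
import re

def tag_positions(usfm_text):
    """Return (chapter, verse, line) for each non-empty line.

    Assumption for this sketch: a \\c line resets the verse to 0, so
    everything between \\c and the first \\v is 'verse 0' of that chapter.
    """
    chapter, verse = 0, 0
    tagged = []
    for line in usfm_text.splitlines():
        if not line.strip():
            continue
        c = re.match(r"\\c\s+(\d+)", line)
        v = re.match(r"\\v\s+(\d+)", line)
        if c:
            chapter, verse = int(c.group(1)), 0
        elif v:
            verse = int(v.group(1))
        tagged.append((chapter, verse, line))
    return tagged

SAMPLE = r"""\id JHN - John
\h JOHN
\toc1 John
\mt3 The Gospel of
\mt1 John
\c 1
\s1 A title
\p
\v 1 In the beginning...
\v 2 etc"""

for chapter, verse, line in tag_positions(SAMPLE):
    print(f"{chapter}:{verse}  {line}")
```

Under this model the \c 1, \s1 and \p lines are all tagged 1:0, so a range that starts at 1:1 leaves them out unless they are pulled back in explicitly.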

My own thought is that if someone specifies JHN 2:5-8 (which behaves in the same way), they would probably expect that the chapter number and last-met paragraph before verse 5 would be nevertheless included (especially since it's possible to turn off chapter numbers elsewhere in the UI), but they would probably not expect a section title that occurred before verse 1.

I started by saying the code is internally consistent, but I've just found it is not entirely so: if I select JHN 2:13-3:4, then the title block just before verse 13 is included. This seems to be skipped if the starting verse number is 1 (2:1 misses the title too).

Another challenge to the code: if I specify JHN 2:0-0, hoping to get just the chapter number, it includes the entire chapter. It seems that '-0' means "to the end".
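One way such behaviour could arise is an end verse of 0 doubling as an "open-ended" sentinel. The Python sketch below is a guess at the mechanism, written only to show the ambiguity; `verse_selected` and its `zero_means_end` flag are hypothetical names and do not come from ptx2pdf.

```python
def verse_selected(verse, first, last, zero_means_end=True):
    """Return True if `verse` lies in first..last within a single chapter."""
    if zero_means_end and last == 0:
        # Hypothetical sentinel: an end verse of 0 is read as "to the end".
        return verse >= first
    return first <= verse <= last

verses = [0, 1, 2, 3]  # 0 stands for the material before \v 1

# With the sentinel, a 0-0 range selects the whole chapter.
print([v for v in verses if verse_selected(v, 0, 0)])                        # [0, 1, 2, 3]
# Without it, a 0-0 range selects only the pre-verse material.
print([v for v in verses if verse_selected(v, 0, 0, zero_means_end=False)])  # [0]
```

If this is roughly what happens, representing "to the end" with a distinct value such as `None` would leave 0 free to mean the pre-verse material.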

markpenny commented 1 year ago

I guess this is related to #812

davidg-sil commented 11 months ago

@JimSmith763268 There was a fix for https://github.com/sillsdev/ptx2pdf/issues/894 which might have fixed this issue as well. Could you please check and confirm, and close the issue if all is well now? (Note that the auto-inserted \zsetref code needed to fix image inclusion might cause issues when exporting USFM to SAB, so keep your eyes open for surprises there.)

JimSmith763268 commented 11 months ago

I just tried it with version 2.3.51. When I typeset EXO 1:1-4, it printed the title of the book, and then the four verses.

I don't get section headings or chapter numbers (even though those boxes are ticked in the GUI). I don't have the documentation in front of me to know whether that is expected behavior. If it is then I will close this (or you can). If I do EXO 1 then I get the chapter number and section headings.

If it is expected behavior, it would be nice for those options to be grayed out in the GUI when you enter a verse range instead of a whole book or chapter.

davidg-sil commented 11 months ago

I wouldn't call it the expected behaviour, I'd call it a long-standing bug! Unfortunately I don't speak Python well enough to try fixing it.

Worked example of what I'd expect (not what happens now, it seems).

I know that opinions differ (opinions, @mhosken, @markpenny?), but what I think I would expect / find reasonable would be:

Input (and final SFM for JHN 0:0-1:28)

\id JHN - John 
\h JOHN
\toc1 John
\mt3 The Gospel of 
\mt1 John
\is The book of John
\ip In this book...
\iot Outline
\iop Christology \ior 1:1-1:28\ior*
\c 1
\ms The Word Made Flesh
\mr 1:1 -- ....
\ip In this section, the author .... 
\s1 A title
\p
\v 1 In the beginning...
\v 2 etc
...
\v 28 

Final SFM for range 1:0-28

\id JHN - John 
\h JOHN
\toc1 John
\mt3 The Gospel of 
\mt1 John
\c 1
\ms The Word Made Flesh
\mr 1:1 -- ....
\ip In this section, the author .... 
\s1 A title
\p
\v 1 In the beginning...
\v 2 etc
...
\v 28 

Final SFM for range 1:1-28

\id JHN - John 
\h JOHN
\toc1 John
\mt3 The Gospel of 
\mt1 John
\c 1
\s1 A title
\p
\v 1 In the beginning...
\v 2 etc
...
\v 28 

Rationale

Note that this is the 'pre-selection' phase. Subsequent modifications that turn introductory headings on and off are expected to happen after this, and to change whether parts of the selection are counted as 'hidden content' or not.
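The pre-selection described in the worked example can be sketched in Python as follows. This is a minimal sketch of the expected behaviour, not the existing ptx2pdf code. The function names, the marker lists, and the one-marker-per-line layout are all assumptions made for the illustration. The rule it encodes is: always keep the title block, keep everything inside the range, and, when the range starts at a verse rather than at verse 0, also keep the \c line plus any section heading and paragraph marker sitting directly before the first verse (or, failing that, the last-met paragraph marker).

```python
import re

# Illustrative marker lists (assumptions, not an exhaustive USFM inventory).
TITLE_MARKERS = ("id", "h", "toc1", "toc2", "toc3", "mt", "mt1", "mt2", "mt3")
PARA_MARKERS = ("p", "m", "pi", "q", "q1", "q2")
HEAD_MARKERS = ("s", "s1", "s2")

def marker(line):
    m = re.match(r"\\([a-z]+\d*)", line)
    return m.group(1) if m else ""

def is_bare_para(line):
    mk = marker(line)
    return mk in PARA_MARKERS and line.strip() == "\\" + mk

def preselect(lines, start, end):
    """Select lines for the range start..end, both (chapter, verse) tuples."""
    # Tag each line with its position; material before the first \v is verse 0.
    pos, chapter, verse = [], 0, 0
    for line in lines:
        num = re.match(r"\\[cv]\s+(\d+)", line)
        if num and marker(line) == "c":
            chapter, verse = int(num.group(1)), 0
        elif num and marker(line) == "v":
            verse = int(num.group(1))
        pos.append((chapter, verse))

    # Keep the title block and everything whose position is inside the range.
    keep = [(p[0] == 0 and marker(l) in TITLE_MARKERS) or start <= p <= end
            for l, p in zip(lines, pos)]

    # A range starting at a verse still needs its chapter and paragraph context.
    first = next((i for i, p in enumerate(pos) if start <= p <= end), None)
    if start[1] > 0 and first is not None and marker(lines[first]) == "v":
        j, found_para = first - 1, False
        # Headings and bare paragraph markers directly before the first verse.
        while j >= 0 and (is_bare_para(lines[j]) or marker(lines[j]) in HEAD_MARKERS):
            found_para = found_para or is_bare_para(lines[j])
            keep[j] = True
            j -= 1
        # Further back in the same chapter: the \c line and, if none was
        # found yet, the last-met bare paragraph marker.
        while j >= 0 and pos[j][0] == start[0]:
            if marker(lines[j]) == "c":
                keep[j] = True
            elif not found_para and is_bare_para(lines[j]):
                keep[j] = found_para = True
            j -= 1
    return [l for l, k in zip(lines, keep) if k]

SAMPLE = r"""\id JHN - John
\h JOHN
\toc1 John
\mt3 The Gospel of
\mt1 John
\is The book of John
\ip In this book...
\iot Outline
\iop Christology \ior 1:1-1:28\ior*
\c 1
\ms The Word Made Flesh
\mr 1:1 -- ....
\ip In this section, the author ....
\s1 A title
\p
\v 1 In the beginning...
\v 2 etc
...
\v 28"""

lines = SAMPLE.splitlines()
print("\n".join(preselect(lines, (1, 1), (1, 28))))
```

On this input the three ranges from the worked example (0:0-1:28, 1:0-28 and 1:1-28) give the three listings shown above. Turning introductory headings on or off would be a separate, later step.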