sillsdev / ptx2pdf

XeTeX based macro package for typesetting USFM formatted (Paratext output) scripture files

Please elegantly handle 2 pictures on one verse #232

Closed davidc86 closed 3 years ago

davidc86 commented 4 years ago

In Genesis and James (so far) we have verses which have 2 illustrations associated with them. Here is the one from James 3:

\v 12 Lemade waikw o hatnimakw e, kyalamo aw arake kisi aw zaitunke kisike e? Ta kyalamo tasy maiskye kisi aw a arake kisike e? Lema! Kola dene wer a mtelaske lema bisa ma kbyetik kyosy wer a mkeske kmatake dakun. \m \fig Irkye byul aw arake kisike|alt="A person picks the fruit of a fig tree" src="LB00085B.TIF" size="col" ref="3:12"\fig* \m \fig Irkye syulw kolakye ma aw zaitunke kisinare kimin ti|alt="A person uses a head basket to carry fruit from the olive tree" src="LB00087B.TIF" size="col" ref="3:12"\fig*

This verse talks about fig trees and olive trees. These are unknown trees in our area. So both are illustrated here. It is pretty arbitrary that Paratext users can't put two pictures on one verse simply because the ptx2pdf macros can't handle this. Rather PTXprint should elegantly handle this situation.

The current PTXprint behavior simply DELETES one of the illustrations and doesn't even list it in the PicList table. So an inattentive user likely won't even notice that some of his illustrations have been deleted.

Minimally PTXprint should notify the user that "One of the illustrations at Bk. P:v has been deleted because only one illustration per verse is allowed in PTXprint."
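As a rough illustration of the notification idea, the sketch below scans USFM text for \fig entries and reports every reference that carries more than one illustration. It is not taken from the PTXprint source: the function names `find_figs` and `shared_ref_warnings`, the regular expressions, and the wording of the message are all assumptions, and the pattern only covers simple USFM 3 figures whose caption contains no nested markers.

```python
import re
from collections import defaultdict

# Hypothetical patterns: match figures of the form \fig caption|attr="value" ...\fig*
FIG_RE = re.compile(r'\\fig\s+(?P<caption>[^|\\]*)\|(?P<attrs>[^\\]*)\\fig\*')
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')

def find_figs(usfm):
    """Return one dict per figure, holding the caption plus its attributes."""
    figs = []
    for m in FIG_RE.finditer(usfm):
        fig = dict(ATTR_RE.findall(m.group("attrs")))
        fig["caption"] = m.group("caption").strip()
        figs.append(fig)
    return figs

def shared_ref_warnings(book, figs):
    """Build one warning per reference that carries more than one figure."""
    by_ref = defaultdict(list)
    for fig in figs:
        by_ref[fig.get("ref", "")].append(fig.get("src", "?"))
    return [
        f"{book} {ref}: {len(srcs)} illustrations share this verse ({', '.join(srcs)})"
        for ref, srcs in by_ref.items()
        if len(srcs) > 1
    ]
```

A check like this could run before any entry is dropped, so the user at least sees which pictures are affected.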

It would be MUCH better, though, if the PicList table showed ALL the \figs listed in all the books in the configuration, but marked those that have issues somehow, e.g. in RED or with an X box (and ideally with a Note saying why there is an issue). Then the user could keep the REF label as it is in Paratext, but change the Anchor verse reference to an adjacent verse directly in the table, thus keeping the two illustrations, and keeping them close together, while satisfying a program limitation. After the user changes the Anchor verse, PTXprint would change that entry to black. (Or use Check boxes and X boxes -- whatever is easiest and communicates best.)

(An extension of the check box / X box idea would allow the user to UNCHECK illustrations they want skipped in this configuration--without having to tweak the PTX SFM files.)

The current behavior (a PicList table with no obvious way to edit \fig entries that ARE NOT there) has me a bit puzzled about how to fix this so that I get what I could easily do in earlier versions of PTXprint. (The problem is that Paratext does NOT provide an "anchor point" distinct from the "Reference" verse. This limitation means I can't have 2 illustrations that point to the same verse but are anchored to adjacent verses.)

I think the most Ideal solution is to just get the macros to allow for multiple illustrations on a verse, so the user doesn't need to worry about it. But that might be too hard to do programmatically.

davidc86 commented 4 years ago

I found that I had to go outside of PTXprint, find the old PicList file for James in a much earlier configuration test, and basically recreate that information in the new James PicList file using the newer structure. Then I reopened PTXprint and it loaded the changes successfully, so now I see both illustrations anchored to different verses but referencing the same verse. (Not for the normal user.)

davidg-sil commented 4 years ago

Quite why the piclist handling code is failing, I don't know. I've just edited a piclist outside ptxprint to have a second picture at the same anchor, and I can confirm that the first picture is indeed deleted (no matter the piclist sequence). I've also edited a piclist within ptxprint and only the second picture at a given verse is saved. I wonder if it might be some kind of accidentally-destructive sorting process?
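For illustration only, here is one generic way such a loss can happen. This is not a diagnosis of the actual ptxprint code and swaps a keyed lookup in for the sorting step suspected above: if entries are stored in a mapping keyed by anchor alone, a later entry silently replaces an earlier one. The function names and the sample data layout are hypothetical.

```python
# Hypothetical piclist rows: (anchor, image file)
entries = [
    ("JAS 3.12", "LB00085B.TIF"),
    ("JAS 3.12", "LB00087B.TIF"),
]

def load_keyed_by_anchor(rows):
    """Last write wins: a second row with the same anchor replaces the first."""
    table = {}
    for anchor, src in rows:
        table[anchor] = src
    return table

def load_keeping_all(rows):
    """Keep a list per anchor so that no row is dropped."""
    table = {}
    for anchor, src in rows:
        table.setdefault(anchor, []).append(src)
    return table
```

With the first loader only the second picture survives, which would match the symptom described, though the real cause may well be different.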

To set the record straight, the ptx2pdf macros now cope with multiple illustrations on one verse, and with out-of-order piclists. (I've tested them with 3, and the code to check the next sequence should cope with any number less than TeX's value of max_integer, not that X-thousand pictures makes any sense.) I actually see no reason why they should ever have had issues handling multiple illustrations on one verse that do not come from a piclist, either. The explanation given in the old documentation never rang true to me; I wonder if it was a case of misidentification of another bug.

If you find references to this restriction in current versions of any documentation that we control, please tell us so we can remove it; it is incorrect.

davidc86 commented 4 years ago

We discussed this elsewhere, and for simplicity of the picture Edit Details window, disallowing two pictures to use the same anchor ref is reasonable.

What is not cool is that Paratext allows us to associate 2 pictures with one verse reference, and the \fig field does not differentiate between "anchor ref." and "caption ref." -- and this is the main problem. Paratext allows it, so PTXprint needs to elegantly disallow it :) -- My suggestion is that when the first picture is encountered while reading the SFM file, its caption ref is used as the anchor ref too. But when the next picture is encountered with the same caption ref, that ref stays in the caption ref. field, while the anchor ref is moved to the next verse (or maybe the previous one in case it's the last verse?). The point is that users need both captions to point to the same verse, and they need both pictures to be near each other. And they don't want one of their pictures to be discarded because of an incompatibility between Paratext and PTXprint.
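A minimal sketch of that suggestion, assuming figures arrive in SFM order as (chapter, verse, src) tuples and that the last verse number of each chapter is known. The function name `assign_anchors` and the data layout are hypothetical, not PTXprint's.

```python
def assign_anchors(figs, last_verse):
    """Give every figure a unique anchor while keeping its caption ref.

    figs:       list of (chapter, verse, src) tuples in SFM order (assumed layout)
    last_verse: dict mapping chapter number to its last verse number
    """
    used = set()
    result = []
    for chapter, verse, src in figs:
        anchor = (chapter, verse)
        if anchor in used:
            # Prefer the following verses; fall back to earlier ones
            # when the chapter has no free verse after this one.
            later = [(chapter, v) for v in range(verse + 1, last_verse[chapter] + 1)]
            earlier = [(chapter, v) for v in range(verse - 1, 0, -1)]
            anchor = next((c for c in later + earlier if c not in used), None)
            if anchor is None:
                raise ValueError(f"no free anchor verse in chapter {chapter}")
        used.add(anchor)
        result.append({
            "src": src,
            "caption_ref": f"{chapter}:{verse}",
            "anchor_ref": f"{anchor[0]}:{anchor[1]}",
        })
    return result
```

For the James example, both pictures would keep the caption ref 3:12, while the second would be anchored at 3:13.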

mhosken commented 4 years ago

There is a wider discussion to be had about the whole concept of out of band information referencing locations in a .usfm file. Currently we are using BKK C.V but we may need to go finer than that. This may involve a subtle change to the .piclist format (uh oh), and anchors.

Note the contrast between anchors (as you describe as caption ref) and refs (anchor refs).

Currently we are hampered because of our limited anchor format. I think this needs to extend and then we could have more than one fig per verse (which would fit with USFM storing more than one fig per verse). Anyway. This is a good place to start that discussion.

mhosken commented 4 years ago

Proposal 1.

Anchors increase to BKK C.V anchorid which then allows for someone to insert \a anchorid\a* into their text. Clearly anchorid is optional, but you wouldn't be able to have more than one entry in a fig list with the same anchor, so if you want more than one fig per verse, one would need to be anchored. We could also beef up \fig to either support an anchor or to receive an implicit f1 f2 f3 anchor.

This improved anchoring scheme would allow us to specify other publication specific information, for example setting paragraph categories from outside and perhaps doing away with the very odd syntax we have for adjustment lists.

We could also have implicit anchors at the start of each paragraph following a verse. Thus GEN 3.1 1 is the start of the paragraph after the paragraph containing GEN 3.1.

As I say, just a proposal.
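To make the proposed anchor shape concrete, here is a small sketch that parses "BKK C.V anchorid" (with the anchorid optional) and flags anchors that occur more than once in a fig list. The regular expression and the function names are assumptions for illustration; the real .piclist format may differ.

```python
import re
from collections import Counter

# Assumed shape: three-character book code, chapter.verse, optional anchor id.
ANCHOR_RE = re.compile(
    r'^(?P<book>[A-Z0-9]{3}) (?P<chapter>\d+)\.(?P<verse>\d+)(?: (?P<anchorid>\S+))?$'
)

def parse_anchor(text):
    """Split an anchor into (book, chapter, verse, anchorid or None)."""
    m = ANCHOR_RE.match(text.strip())
    if m is None:
        raise ValueError(f"not a valid anchor: {text!r}")
    return (m["book"], int(m["chapter"]), int(m["verse"]), m["anchorid"])

def duplicate_anchors(anchors):
    """Return the parsed anchors that occur more than once."""
    counts = Counter(parse_anchor(a) for a in anchors)
    return [anchor for anchor, n in counts.items() if n > 1]
```

Under this scheme "JAS 3.12" and "JAS 3.12 f2" are distinct anchors, so two figures on one verse no longer collide.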

davidg-sil commented 4 years ago

Proposal 2 (markup).

USFM3 allows \foo|attributes\* type 'milestones' (as well as \foo-s\* and \foo-e\* start and end 'milestones' for defining a range). All milestones are allowed an id= attribute, and apparently all character styles are allowed to be 'subverted' into milestones. It therefore seems unlikely that anchors will get support in USFM; if they are supported, they will become a milestone type.

As we are planning to implement attribute support, and (attribute-free) milestones are almost working (expect a push in next day or two), I suggest that where we NEED anchors, we use milestone-format. e.g. \za|id=anchorid\*
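As a sketch of how tooling outside the TeX macros might locate such anchors, the snippet below pulls the ids out of \za milestones. The pattern is an assumption: it accepts both the unquoted form shown above and a quoted id="..." form, and the function name `find_anchor_ids` is hypothetical.

```python
import re

# Accepts both \za|id=anchorid\* and \za|id="anchorid"\*
MILESTONE_RE = re.compile(r'\\za\|id="?(?P<id>[^"\\\s]+)"?\\\*')

def find_anchor_ids(usfm):
    """Return the anchor ids of all \\za milestones, in document order."""
    return [m.group("id") for m in MILESTONE_RE.finditer(usfm)]
```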

Caveat.

The use of anchors in the USFM ruins the point of piclists. I don't support anchors for biblical text where we have natural chapter.verse anchors. The reason for suggesting it seems to be to overcome a difficulty in keeping track of piclist entries within the python code. I therefore propose the following.

davidg-sil commented 4 years ago

Proposal 3 (internal id)

Since the main issue, as far as I understand it, is that the ptxprint Python code needs a unique id for each entry that it's editing, so that it can keep track of things (very reasonable need!), let there be a unique x-id="sfmJOH12.34a" code that records where the original entry came from (in case the user changes that). That is, ones that come originally from SFM get a prefix of SFM, and ones that were added manually get a different ID. This reference would be copied verbatim anywhere that the piclist is written out. It could of course also include a timestamp (minutes past 2020-01-01 midnight?) that could allow an 'another project updated this image, do you want those changes here too?' type dialogue. Options: Synchronise / leave it alone / "break link so this is a new image"
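A minimal sketch of how such ids might be generated, following the shape of the sfmJOH12.34a example. The function names, the "usr" prefix used in the usage note for manually added entries, and the choice of a trailing letter to separate entries on the same verse are all assumptions.

```python
from datetime import datetime
from string import ascii_lowercase

EPOCH = datetime(2020, 1, 1)  # "minutes past 2020-01-01 midnight"

def make_xid(origin, book, chapter, verse, existing):
    """Return an unused id such as 'sfmJOH12.34a' and record it in `existing`.

    origin:   prefix naming where the entry came from, e.g. 'sfm'
    existing: set of ids already handed out (updated in place)
    """
    base = f"{origin}{book}{chapter}.{verse}"
    for letter in ascii_lowercase:
        candidate = base + letter
        if candidate not in existing:
            existing.add(candidate)
            return candidate
    raise ValueError(f"more than 26 entries for {base}")

def minutes_since_epoch(when):
    """Whole minutes between EPOCH and the given naive datetime."""
    return int((when - EPOCH).total_seconds() // 60)
```

Two figures read from the SFM at JOH 12:34 would get sfmJOH12.34a and sfmJOH12.34b, while a manually added one would get something like usrJOH12.34a, so the editor can tell them apart even if the user later changes the anchor.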

markpenny commented 3 years ago

Thanks to some skillful work by @mhosken this is now possible. Refer to #309 to see an example. Should be available in 1.4.1 within the next day or two.