sillsdev / ptx2pdf

XeTeX based macro package for typesetting USFM formatted (Paratext output) scripture files

Implement custom merge method #850

Closed davidg-sil closed 1 year ago

davidg-sil commented 1 year ago

Fundamental assumptions: (a) Users have different ideas from the programmers about which sync points they might want to use, e.g. syncing only at matching section titles. Their needs will get more complex with polyglot. (b) They might have different ideas about what works well for a given translation globally, or they might have a specific setup for a particular configuration. (c) If they're setting up a particular configuration, they probably don't want to edit 2 (or N) different files.

I think this means that we have a config-specific custom-merge.cfg file per configuration, and a fall-back one per project. (And if they've got neither of those, yet they select custom, the code could write them a default one, I guess.)

The shared file should just have a single set of scores (in the range 0-1), to be multiplied by the weighting factor that the UI will expose at some point.

The config-specific file can have a section for each column (e.g. to ignore the paragraphs from one column, but still take account of its section breaks). I guess those would be separate sections, like:

[L]
WEIGHT=50
preversepar=2
noversepar=1
versepar=1
heading=1
chapter=1
chapterhead=1
verse=0
[R]
WEIGHT=50
preversepar=0
noversepar=1
versepar=1
heading=1
chapter=1
chapterhead=1

(Implication: we want to copy the paragraph structure of L, because R has too many paragraphs. We normally ignore R's paragraphs, except when they are mid-verse.)
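A minimal sketch of how a file in this shape could be read, assuming it is close enough to INI syntax for Python's standard configparser. The function name load_column_scores and the fallback weight of 1.0 are illustrative assumptions, not part of the proposal, and the sketch does not guess how WEIGHT is combined with the scores:

```python
import configparser

# The per-column example from this comment, reproduced as a string.
SAMPLE = """\
[L]
WEIGHT=50
preversepar=2
noversepar=1
versepar=1
heading=1
chapter=1
chapterhead=1
verse=0
[R]
WEIGHT=50
preversepar=0
noversepar=1
versepar=1
heading=1
chapter=1
chapterhead=1
"""


def load_column_scores(text):
    """Return {column: (weight, {sync_point: score})} for each section.

    Hypothetical helper: the name and return shape are assumptions.
    """
    parser = configparser.ConfigParser()
    # Keep option names as written; configparser lowercases them by default.
    parser.optionxform = str
    parser.read_string(text)

    columns = {}
    for name in parser.sections():
        section = parser[name]
        # Assumed fallback of 1.0 when a section has no WEIGHT entry.
        weight = section.getfloat("WEIGHT", fallback=1.0)
        scores = {
            key: float(value)
            for key, value in section.items()
            if key != "WEIGHT"
        }
        columns[name] = (weight, scores)
    return columns


columns = load_column_scores(SAMPLE)
```

Keys absent from a section (such as verse under [R] in the example) are simply missing from that column's dictionary; what default they should take is left open here.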

davidg-sil commented 1 year ago

If the config file being used is from a higher level (project) directory, then a format more like this is appropriate, where the different configurations are listed:

[Default]
.
.
.
[Testing]
.
.
.

Or alternatively:

[custom]
.
.
[custom:nopars]
.
.

This second form might be appropriate where different custom options are configured.

My proposal is therefore that the custom code should do the following:

  1. Identify whether a 'variety' of custom is being selected (e.g. custom:nopars). If so, look for merge-nopars.cfg in the working directory, expecting 'L' / 'R' values as above.
  2. Look for merge-nopars.cfg in the configuration directory, expecting 'L' / 'R' values as above.
  3. Look for merge-nopars.cfg in the shared/ptxprint directory and paratext directory.
  4. Repeat 1-3, looking for merge.cfg instead.

If there is no variety, then the filename is merge.cfg in 1-3 above, and step 4 is omitted.
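The lookup order above can be sketched as follows. This is only an illustration: the function names are hypothetical, and the caller is assumed to pass the directories already in priority order (working, configuration, shared/ptxprint, paratext), since how those paths are obtained is not specified here.

```python
from pathlib import Path


def candidate_paths(merge_mode, search_dirs):
    """List config paths to try, highest priority first.

    merge_mode is the selected mode, e.g. "custom" or "custom:nopars".
    search_dirs are directories in priority order (an assumed input).
    """
    # Everything after the first ':' is treated as the variety.
    _, _, variety = merge_mode.partition(":")

    names = []
    if variety:
        # Steps 1-3: the variety-specific file in every directory.
        names.append(f"merge-{variety}.cfg")
    # Step 4, or the only pass when there is no variety.
    names.append("merge.cfg")

    return [Path(d) / name for name in names for d in search_dirs]


def find_merge_config(merge_mode, search_dirs):
    """Return the first candidate that exists on disk, or None."""
    for path in candidate_paths(merge_mode, search_dirs):
        if path.is_file():
            return path
    return None
```

With a variety, every directory is tried for merge-nopars.cfg before any directory is tried for merge.cfg, matching "Repeat 1-3" in step 4.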

Within the config file, if the file path includes the configuration, then look for the variety as a section, then, if appropriate, the column label ([L] or [R]), then search for the configuration name.