sillsdev / ptx2pdf

XeTeX based macro package for typesetting USFM formatted (Paratext output) scripture files

Investigate spanning table columns #966

Closed davidg-sil closed 2 weeks ago

davidg-sil commented 1 month ago

It's not going to be easy.

davidg-sil commented 1 month ago

Currently, - is treated as a letter so that ranged milestones such as \qt-s ... \* work. Given that \qt-s is distinct from \qt in style, interpretation, etc., this is a good thing. It follows that \tc1-2 is, and must be, treated as a distinct USFM command.

The code currently recognises 9 columns for table cells, with th, thc, thr, tc, tcc and tcr as prefixes, i.e. 9*6 = 54 cell types. If spanning of 2 adjacent columns is allowed (while still allowing only 9 columns), this becomes 9*6 + 8*6 = 102 cell types. If all combinations of the 9 columns are allowed, the total rises to 270 commands (45 contiguous column ranges * 6 prefixes). This is probably manageable for TeX, but the commands should be generated programmatically rather than typed by hand, to avoid the risk of undetectable typos.
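The counts above can be checked with a short script. This is only a sketch in Python, not part of ptx2pdf: the function name `cell_markers` and its `max_span` parameter are made up for illustration, and the `start-end` naming follows the \tc1-2 form mentioned earlier.

```python
# Hypothetical helper: enumerate table-cell marker names for a given maximum span.
PREFIXES = ["th", "thc", "thr", "tc", "tcc", "tcr"]
MAX_COLUMNS = 9


def cell_markers(max_span):
    """Return marker names for cells spanning 1..max_span adjacent columns."""
    markers = []
    for prefix in PREFIXES:
        for start in range(1, MAX_COLUMNS + 1):
            # A span may not run past the last column.
            last = min(start + max_span - 1, MAX_COLUMNS)
            for end in range(start, last + 1):
                if end == start:
                    markers.append(f"{prefix}{start}")
                else:
                    markers.append(f"{prefix}{start}-{end}")
    return markers


print(len(cell_markers(1)))  # single columns only
print(len(cell_markers(2)))  # single columns plus 2-column spans
print(len(cell_markers(9)))  # every contiguous range of columns
```

Generating the names in a loop like this is what removes the typo risk: a mistake shows up in every marker at once instead of hiding in one of them.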

davidg-sil commented 1 month ago

The current process is approximately as follows:

  1. Cells are read and put into column-stacks, with a macro saved for the row/column indicating the alignment.
  2. The maximum cell width in each column stack is determined.
  3. The page width is divided according to the space-sharing algorithm.
  4. Cells are removed from the column stacks, and adjusted to the available width according to the saved alignment.

The present code uses quite an elegant formula that attempts to distribute widths fairly among columns that fit within their share of the page, while adjusting for the width of the widest material each column contains (a narrow column donates its unused space to the other columns). This, however, means that the column widths are not known until all the rows have been read. In turn, a spanning cell must not only be remembered; it must also be recorded that it spans multiple columns, and it should be excluded from the normal width calculations. (It could possibly contribute fractionally to every column it spans, but this is not recommended, as it would make an otherwise narrow column needlessly wide.)
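To make the idea of space sharing concrete, here is a minimal Python sketch. It is NOT the formula ptx2pdf uses (that formula is not given in this thread); it is a simple "water-filling" stand-in, assumed for illustration, in which columns narrower than their fair share keep their natural width and the surplus is re-divided among the rest. The names `natural_widths` and `share_widths` are hypothetical.

```python
def natural_widths(rows):
    """Widest cell per column; rows is a list of lists of measured cell widths."""
    n_cols = max(len(row) for row in rows)
    widths = [0] * n_cols
    for row in rows:
        for col, width in enumerate(row):
            widths[col] = max(widths[col], width)
    return widths


def share_widths(natural, page_width):
    """Assumed stand-in algorithm: narrow columns donate unused space to wide ones."""
    widths = [None] * len(natural)
    remaining = page_width
    pending = set(range(len(natural)))
    while pending:
        share = remaining / len(pending)
        fits = [i for i in pending if natural[i] <= share]
        if not fits:
            # Every remaining column wants more than its share: split evenly.
            for i in pending:
                widths[i] = share
            break
        for i in fits:
            # This column fits; it keeps its natural width and donates the rest.
            widths[i] = natural[i]
            remaining -= natural[i]
            pending.discard(i)
    return widths


rows = [[20, 100, 150], [15, 80, 200]]
print(share_widths(natural_widths(rows), 240))
```

Note that `natural_widths` needs every row before it can return, which is the same constraint described above: no column width is final until the whole table has been read.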

A modified approach might be:

  1. An additional column stack is allocated for multi-column spans.
  1b. Cells that are in multi-column spans are stored in the special stack. A macro is written, identified by row/starting column, which stores how many columns the cell should occupy.
  4b. When a row/column macro exists, the text is removed from the special stack and split into lines according to the width of the combined column.

davidg-sil commented 2 weeks ago

Commit f13f203bc4 represents a first attempt at getting this working. No known bugs, but not very well tested.

\spanningcell{tc}{}{1}{2}
\spanningcell{th}{c}{1}{2}

These define the markers \tc1-2 and \thc1-2 respectively.
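A list of such definitions could be generated rather than typed. The Python sketch below emits \spanningcell lines in the four-argument form shown above; the meaning of the arguments (base marker, alignment suffix, first column, last column) is inferred from those two examples only, and the function name `spanningcell_lines` is hypothetical.

```python
def spanningcell_lines(max_columns=9, max_span=2):
    """Emit one \\spanningcell definition per spanning marker (argument order assumed)."""
    # (base marker, alignment suffix) pairs giving th, thc, thr, tc, tcc, tcr.
    bases = [("th", ""), ("th", "c"), ("th", "r"),
             ("tc", ""), ("tc", "c"), ("tc", "r")]
    lines = []
    for base, align in bases:
        for start in range(1, max_columns):
            last = min(start + max_span - 1, max_columns)
            for end in range(start + 1, last + 1):
                lines.append(
                    f"\\spanningcell{{{base}}}{{{align}}}{{{start}}}{{{end}}}"
                )
    return lines


for line in spanningcell_lines()[:3]:
    print(line)
```

With the defaults this yields only the 2-column spans; raising `max_span` to 9 would cover every contiguous range.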

davidg-sil commented 2 weeks ago

The actual approach does not use an additional stack. Instead, it writes the multi-column span as 0-width entries that get restored and wrapped to the combined column width.
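The effect of treating spans as 0-width entries can be illustrated with a rough Python model. This is a sketch under several assumptions, not a transcription of the TeX code: widths are measured in characters, each cell is a hypothetical `(column, span, text)` tuple, columns simply take their natural width, and the inter-column gap `GAP` is invented for the example.

```python
import textwrap

GAP = 1  # assumed gap between adjacent columns, in characters


def column_widths(rows, n_cols):
    """Widest cell per column, with spanning cells counted as zero width."""
    widths = [0] * n_cols
    for row in rows:
        for col, span, text in row:
            measured = len(text) if span == 1 else 0
            widths[col] = max(widths[col], measured)
    return widths


def wrap_cells(rows, widths):
    """Wrap every cell to the combined width of the columns it occupies."""
    wrapped_rows = []
    for row in rows:
        wrapped = []
        for col, span, text in row:
            combined = sum(widths[col:col + span]) + GAP * (span - 1)
            wrapped.append(textwrap.wrap(text, width=max(combined, 1)))
        wrapped_rows.append(wrapped)
    return wrapped_rows


rows = [
    [(0, 1, "Name"), (1, 1, "Value")],
    [(0, 2, "A heading spanning both columns")],
]
widths = column_widths(rows, 2)
for row in wrap_cells(rows, widths):
    print(row)
```

Because the spanning cell contributes nothing to `column_widths`, it cannot widen either column; it is only fitted into whatever combined width the ordinary cells produce.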