drmacro / wordinator

Generate high-quality DOCX files using a simplified XML format (simple word processing XML).
Apache License 2.0

The `rowspan` attribute does not seem to be supported #100

Closed by larsga 1 year ago

larsga commented 1 year ago

The attached zip file contains an .swpx file that uses rowspan in a table.

This image shows what the table should look like

image

This is how it's rendered by Word. As you can see, the rowspan is ignored:

image

table-stuff-20230114-Holman.zip

larsga commented 1 year ago

Looking at the code it seems like something is missing in the rowspan support:

      if (null != rowspan) {
        try {
          int spanval = Integer.parseInt(rowspan);
          CTDecimalNumber spanNumber = CTDecimalNumber.Factory.newInstance();
          spanNumber.setVal(BigInteger.valueOf(spanval));
          rowSpanManager.addColumn(cellCtr, spanval);
          CTVMerge vMerge = CTVMerge.Factory.newInstance();
          vMerge.setVal(STMerge.RESTART);
          ctTcPr.setVMerge(vMerge);
        } catch (NumberFormatException e) {
          log.warn("Non-numeric value for @rowspan: \"" + rowspan + "\". Ignored.");
        }
      }

The vmerge restart is set, but the actual rowspan number isn't used anywhere. The spanNumber variable is created, but then never used. Unfortunately, I don't know OOXML well enough to say exactly what should be done, but from this example it looks like the next cell vertically below should be added to the same <w:tc> in order to get the vertical merge.

drmacro commented 1 year ago

This test shows that rowspan works, at least sometimes (simplewpml-test-01.swpx):

    <tr>
      <td align="right"><p><run>R2C1 right-aligned</run></p></td>
      <td rowspan="2" valign="center" align="center"><p><run>Span 2 rows</run></p></td>
      <td><p><run>R3C3</run></p></td>
    </tr>

From the Schema docs for td:

NOTE: For vertical spans, all the cells spanned must be present in the row and must have the same @colspan value. The first cell in the vertical span specifies @rowspan with a value greater than 1. The subsequent cells may be empty or may include the <vspan> marker.

In the sample provided, not all cells are accounted for in the rows: the second row has only 5 cells but needs 8, with the first 3 cells containing <vspan/>. (It looks like the doc I quoted is wrong: an empty cell is not sufficient. I'll correct that.)

This markup produces the correct result for the header:

          <wp:thead>
            <wp:tr>
              <wp:td colspan="1" rowspan="2" borderstyle="none" colsep="0" rowsep="0">
                <wp:p style="Table body">
                  <wp:run> </wp:run>
                </wp:p>
              </wp:td>
              <wp:td colspan="1" rowspan="2" borderstyle="none" colsep="0" rowsep="0">
                <wp:p style="Table body">
                  <wp:run>Classification parameter</wp:run>
                </wp:p>
              </wp:td>
              <wp:td colspan="1" rowspan="2" borderstyle="none" colsep="0" rowsep="0">
                <wp:p style="Table body">
                  <wp:run>Dimension</wp:run>
                </wp:p>
              </wp:td>
              <wp:td colspan="5" rowspan="1" borderstyle="none" colsep="0" rowsep="0">
                <wp:p style="Table body">
                  <wp:run>Class</wp:run>
                </wp:p>
              </wp:td>
            </wp:tr>
            <wp:tr>
              <wp:td><wp:vspan/></wp:td>
              <wp:td><wp:vspan/></wp:td>
              <wp:td><wp:vspan/></wp:td>
              <wp:td colspan="1" rowspan="1" borderstyle="none" colsep="0" rowsep="0">
                <wp:p style="Table body">
                  <wp:run>B2</wp:run>
                  <wp:run style="Subscript">ca</wp:run>
                </wp:p>
              </wp:td>
              <wp:td colspan="1" rowspan="1" borderstyle="none" colsep="0" rowsep="0">
                <wp:p style="Table body">
                  <wp:run>C</wp:run>
                  <wp:run style="Subscript">ca</wp:run>
                </wp:p>
              </wp:td>
              <wp:td colspan="1" rowspan="1" borderstyle="none" colsep="0" rowsep="0">
                <wp:p style="Table body">
                  <wp:run>D</wp:run>
                  <wp:run style="Subscript">ca</wp:run>
                </wp:p>
              </wp:td>
              <wp:td colspan="1" rowspan="1" borderstyle="none" colsep="0" rowsep="0">
                <wp:p style="Table body">
                  <wp:run>s1</wp:run>
                </wp:p>
              </wp:td>
              <wp:td colspan="1" rowspan="1" borderstyle="none" colsep="0" rowsep="0">
                <wp:p style="Table body">
                  <wp:run>s2</wp:run>
                </wp:p>
              </wp:td>
            </wp:tr>
          </wp:thead>

image
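
The requirement that every row account for all columns can be checked before the table reaches DOCX generation. The following is a minimal sketch, not part of Wordinator: the class `RowWidthCheck` and its methods are hypothetical, each row is reduced to an array of its cells' colspan values, and a cell without a colspan attribute (such as the vspan cells above) is assumed to count as 1.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Wordinator class.
class RowWidthCheck {

    /** Number of grid columns a row occupies: the sum of its cells' colspans. */
    static int rowWidth(int[] colspans) {
        int width = 0;
        for (int span : colspans) {
            width += span;
        }
        return width;
    }

    /** Zero-based indexes of the rows whose width differs from the expected column count. */
    static List<Integer> shortRows(int[][] rows, int expectedColumns) {
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < rows.length; i++) {
            if (rowWidth(rows[i]) != expectedColumns) {
                bad.add(i);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Header markup above: 1+1+1+5 columns, then 3 vspan cells plus 5 ordinary cells.
        int[][] populated = { {1, 1, 1, 5}, {1, 1, 1, 1, 1, 1, 1, 1} };
        System.out.println(shortRows(populated, 8)); // prints []

        // Same header with the vspan cells left out of the second row.
        int[][] sparse = { {1, 1, 1, 5}, {1, 1, 1, 1, 1} };
        System.out.println(shortRows(sparse, 8)); // prints [1]
    }
}
```

A check like this only counts columns. It does not verify that the vspan cells sit under the cell that declares the rowspan.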

drmacro commented 1 year ago

By adding the missing cells to the rows after the rowspan="6" I get the correct result: image

The transformation challenge is accounting for all the vertical spanning cells.

The DOCX generation does not try to do it--it requires that the table be fully populated.
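
One way a transform could populate the table is to remember, per grid column, how many more rows an earlier rowspan still covers, and to emit a placeholder cell whenever a row reaches such a column. The sketch below illustrates that idea under stated assumptions: `VspanFiller` and `Cell` are hypothetical names (not Wordinator classes), the input is HTML-style with covered cells omitted, and each placeholder copies the colspan of the cell that started the span, as the schema note requires.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not a Wordinator class.
class VspanFiller {
    static final String VSPAN = "<vspan/>";

    /** Minimal cell model: text plus span attributes. */
    static final class Cell {
        final String text;
        final int colspan;
        final int rowspan;

        Cell(String text, int colspan, int rowspan) {
            this.text = text;
            this.colspan = colspan;
            this.rowspan = rowspan;
        }
    }

    /** Returns rows in which every position covered by a rowspan holds a placeholder cell. */
    static List<List<Cell>> fill(List<List<Cell>> sparseRows) {
        List<List<Cell>> dense = new ArrayList<>();
        // Start column of an open span -> {rows still to cover, colspan of the span}.
        Map<Integer, int[]> pending = new HashMap<>();
        for (List<Cell> row : sparseRows) {
            List<Cell> out = new ArrayList<>();
            Iterator<Cell> cells = row.iterator();
            int col = 0;
            while (true) {
                int[] span = pending.get(col);
                if (span != null) {
                    // This position is covered from above: emit a placeholder.
                    out.add(new Cell(VSPAN, span[1], 1));
                    if (--span[0] == 0) {
                        pending.remove(col);
                    }
                    col += span[1];
                } else if (cells.hasNext()) {
                    Cell cell = cells.next();
                    out.add(cell);
                    if (cell.rowspan > 1) {
                        pending.put(col, new int[] {cell.rowspan - 1, cell.colspan});
                    }
                    col += cell.colspan;
                } else {
                    break;
                }
            }
            dense.add(out);
        }
        return dense;
    }

    /** Joins the cell texts of one row for display. */
    static String render(List<Cell> row) {
        List<String> texts = new ArrayList<>();
        for (Cell cell : row) {
            texts.add(cell.text);
        }
        return String.join(" | ", texts);
    }

    public static void main(String[] args) {
        List<List<Cell>> sparse = Arrays.asList(
            Arrays.asList(new Cell("A", 1, 1), new Cell("B", 1, 3), new Cell("C", 1, 1)),
            Arrays.asList(new Cell("D", 1, 1), new Cell("E", 1, 1)),
            Arrays.asList(new Cell("F", 1, 1), new Cell("G", 1, 1)));
        for (List<Cell> row : fill(sparse)) {
            System.out.println(render(row));
        }
        // prints:
        // A | B | C
        // D | <vspan/> | E
        // F | <vspan/> | G
    }
}
```

The sketch assumes well-formed input. A row that ends before reaching an open span further to the right is left short rather than padded.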

drmacro commented 1 year ago

If it's any consolation, there's this comment in the out-of-the-box HTML transform for the table element:

    <!-- FIXME: For table rows, handle vertical spanning, which means examining preceding rows to look for
         vertically-spanning cells. Might be easiest to transform whole table into a complete matrix
         before generating the result table.
      -->

drmacro commented 1 year ago

In the simple2dita.xsl transform there is this template:

    <!-- Set up a pseudo-matrix to find the column of the current entry. Start with the first entry
     in the first row. Progress to the end of the row, then start the next row; go until we find
     the test cell (with id=$stop-id).
     If an entry spans rows, add the cells that will be covered to $matrix.
     If we get to an entry and its position is already filled in $matrix, then the entry is pushed
     to the side. Add one to the column count and re-try the entry. -->
  <xsl:template match="*" mode="find-matrix-column">
 ...

drmacro commented 1 year ago

That code might be useful to crib from if you don't already have something similar.
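
For readers who do not want to read the XSLT, the pseudo-matrix idea described in that template comment can be sketched in Java. This is an illustration of the comment's description only, not a port of simple2dita.xsl: `MatrixColumnFinder` is a hypothetical name, cells are identified by row and position rather than by id, and each cell is reduced to a `{colspan, rowspan}` pair.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the pseudo-matrix lookup, not code from Wordinator.
class MatrixColumnFinder {

    /**
     * rows[r][i] holds {colspan, rowspan} for the i-th cell present in row r.
     * Returns the zero-based grid column at which the target cell starts.
     */
    static int findColumn(int[][][] rows, int targetRow, int targetIndex) {
        // Grid positions already covered by cells that span down from earlier rows.
        Set<Long> matrix = new HashSet<>();
        for (int r = 0; r <= targetRow && r < rows.length; r++) {
            int col = 0;
            for (int i = 0; i < rows[r].length; i++) {
                // Position already filled: push the entry to the side and retry.
                while (matrix.contains(key(r, col))) {
                    col++;
                }
                if (r == targetRow && i == targetIndex) {
                    return col;
                }
                int colspan = rows[r][i][0];
                int rowspan = rows[r][i][1];
                // Record the positions this entry covers in the rows below it.
                for (int dr = 1; dr < rowspan; dr++) {
                    for (int dc = 0; dc < colspan; dc++) {
                        matrix.add(key(r + dr, col + dc));
                    }
                }
                col += colspan;
            }
        }
        throw new IllegalArgumentException("No cell at row " + targetRow + ", index " + targetIndex);
    }

    /** Packs a (row, column) pair into one long so it can be stored in a set. */
    private static long key(int row, int col) {
        return ((long) row << 32) | col;
    }

    public static void main(String[] args) {
        // Header from this thread with covered cells omitted:
        // three cells spanning 2 rows plus one cell spanning 5 columns, then 5 ordinary cells.
        int[][][] header = {
            { {1, 2}, {1, 2}, {1, 2}, {5, 1} },
            { {1, 1}, {1, 1}, {1, 1}, {1, 1}, {1, 1} }
        };
        System.out.println(findColumn(header, 1, 0)); // prints 3
        System.out.println(findColumn(header, 1, 4)); // prints 7
    }
}
```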

gkholman commented 1 year ago

Mea culpa! I've never heard of "vspan" before, nor have I seen it in table markup in other vocabularies such as OASIS, HTML, or CALS. My bad. Thanks for your patience.

larsga commented 1 year ago

This was a misunderstanding on our side, so closing the issue.