Open enrac5 opened 11 months ago
Identifying merged cells is not enough by itself. I think you'll find you need to identify "root" cells and "spanned" cells.
A root cell (my term) is the upper-left cell in a merge. All the other cells in the merge are spanned.
Something like this will yield only root cells. An unmerged cell can be thought of as a root cell of its own with no spanned cells.
from typing import Iterator

from docx.table import Table, _Cell


def iter_table_cells(table: Table) -> Iterator[_Cell]:
    """Generate each "visible" cell in `table`.

    Note that not all rows will necessarily have the same number of columns and
    a row can start in a column later than the first if there is a vertical merge.
    """
    for row in table.rows:
        tr = row._tr
        for tc in tr.tc_lst:
            # -- vMerge="continue" indicates a spanned cell in a vertical merge --
            if tc.vMerge == "continue":
                continue
            # -- any other `w:tc` is a root cell, merged or not --
            yield _Cell(tc, row)
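To go one step further and report which root cells actually head a merge, the same loop can look at the column span and the vertical-merge state of each `w:tc`. The following is only a sketch under a few assumptions: `MergeInfo` and `iter_merge_info` are hypothetical names, not part of python-docx, and `tc.vMerge` and `tc.grid_span` are read from python-docx's internal `CT_Tc` element (reached through the private `row._tr`), so they may change between versions. The `Table` import is only used for the type hint.

```python
from dataclasses import dataclass
from typing import TYPE_CHECKING, Iterator

if TYPE_CHECKING:  # only needed for the type hint below
    from docx.table import Table


@dataclass(frozen=True)
class MergeInfo:
    """Hypothetical record describing one root cell (not a python-docx class)."""

    row_idx: int
    tc_idx: int  # position among the row's `w:tc` elements, not the grid column
    col_span: int  # number of grid columns the cell covers
    starts_vertical_merge: bool  # True when vMerge == "restart"

    @property
    def is_merged(self) -> bool:
        """True when this root cell spans more than one grid cell."""
        return self.col_span > 1 or self.starts_vertical_merge


def iter_merge_info(table: "Table") -> Iterator[MergeInfo]:
    """Yield a `MergeInfo` for each root cell in `table`, skipping spanned cells."""
    for row_idx, row in enumerate(table.rows):
        for tc_idx, tc in enumerate(row._tr.tc_lst):
            # -- a vertically spanned cell belongs to a root cell in a row above --
            if tc.vMerge == "continue":
                continue
            yield MergeInfo(
                row_idx=row_idx,
                tc_idx=tc_idx,
                col_span=tc.grid_span,
                starts_vertical_merge=tc.vMerge == "restart",
            )
```

For the attached document, and assuming the table of interest is the first one, this might be used as `table = Document("butt_merged.docx").tables[0]` followed by `[m for m in iter_merge_info(table) if m.row_idx >= 1 and m.is_merged]` to list the merged root cells after the first row. Note that `tc_idx` counts `w:tc` elements, so it can differ from the grid column when earlier cells in the row span several columns.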
I have a doc (a simple snippet is attached to this issue: butt_merged.docx) and I'd like to detect the merged cells after row 1. Right now, I just check whether the first two cells have the same content, but that seems less than ideal (based on the discussion here: https://github.com/python-openxml/python-docx/issues/1311).
What's a better way of checking for merged cells?