python-openxml / python-docx

Create and modify Word documents with Python
MIT License

How to serialize a CT_Tbl object? #1379

Closed by wennycooper 2 months ago

wennycooper commented 2 months ago

I would like to serialize a CT_Tbl object using pickle in Python, but I got:

== code ==
import pickle
pickle.dumps(tbl)
=> TypeError: cannot pickle 'CT_Tbl' object
====

Any suggestions would be appreciated. Thanks!

scanny commented 2 months ago

CT_Tbl is an lxml custom element class. The natural serialization form would be XML, since that's what it is.

All the classes in python-docx whose names start with CT_ work this way: they subclass lxml.etree._Element, so any serialization method that applies to _Element will work.

Easiest is probably tbl.xml. All the CT_ classes have the .xml property.

wennycooper commented 2 months ago

Could you give us a few more hints? For example, which serialization methods apply to _Element? Can we use pickle for this?

scanny commented 2 months ago

Say more about what you're doing that requires you to serialize a table.

lowellknight1114 commented 2 months ago

We want to read a table from a source doc and copy it to a destination doc by retrieving the XML string from table._tbl.xml:

# assumption: DocumentDocx is our alias for docx.Document
from docx import Document as DocumentDocx

doc = DocumentDocx(filename)
table = doc.tables[0]
xml_string = table._tbl.xml

# save xml_string to a file
# read xml_string back from a file
# ??? How do we build a CT_Tbl object from an XML string?
# add_tbl: CT_Tbl = ???

doc_new = DocumentDocx()
paragraph = doc_new.add_paragraph()
paragraph._p.addnext(add_tbl)

We can't figure out how to convert the XML string back into a CT_Tbl object, the way etree.fromstring(xml_table) turns XML into an element. Can you help us with this?

scanny commented 2 months ago

I think what you're looking for is docx.oxml.parser.parse_xml(xml: str | bytes) -> BaseOxmlElement.

lowellknight1114 commented 2 months ago

Yes, this is what we need. Thanks.