mwilliamson / python-mammoth

Convert Word documents (.docx files) to HTML
BSD 2-Clause "Simplified" License

Table header rows don't come through for tables #34

Closed: dividor closed this issue 7 years ago

dividor commented 7 years ago

Attached is a sample document with a table. Mammoth produces a table like this ...

The following variable… | Must be set to…
SET STUFF | Stuff 1
SET STUFF 2 | Stuff 2

So it doesn't specify that the first row is actually a header row.

Could Mammoth perhaps set the header row tag for this scenario?
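To make the request concrete, here is a rough sketch of the markup difference being asked for. It is not Mammoth's code: `render_table` and `header_row_count` are hypothetical names, the cell text is placeholder data, and the `<thead>`/`<th>` output shape is an assumption about what "set the header row tag" would mean.

```python
from html import escape


def render_table(rows, header_row_count=0):
    """Render rows (lists of cell strings) as an HTML table.

    The first header_row_count rows become <th> cells inside <thead>;
    everything else is rendered as ordinary <td> cells.
    """
    def render_row(cells, tag):
        rendered = "".join(
            "<{0}>{1}</{0}>".format(tag, escape(cell)) for cell in cells
        )
        return "<tr>" + rendered + "</tr>"

    head = "".join(render_row(r, "th") for r in rows[:header_row_count])
    body = "".join(render_row(r, "td") for r in rows[header_row_count:])
    if head:
        # Only introduce <thead>/<tbody> when there is a header to separate.
        return "<table><thead>" + head + "</thead><tbody>" + body + "</tbody></table>"
    return "<table>" + body + "</table>"


# Placeholder rows, loosely modelled on the sample table in this report.
rows = [
    ["Variable", "Value"],
    ["SET STUFF", "Stuff 1"],
    ["SET STUFF 2", "Stuff 2"],
]
without_header = render_table(rows)                      # every cell is a <td>
with_header = render_table(rows, header_row_count=1)     # first row uses <th>
```

The first form matches what is described above, where nothing marks the first row as special; the second is the kind of output the request is after.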

thanks! table-headers.docx

mwilliamson commented 7 years ago

Looks like we could read w:tblHeader from the table row properties (w:trPr).
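As a minimal sketch of that idea: in WordprocessingML, `w:tblHeader` appears inside a row's `w:trPr` element, so a reader can check each `w:tr` for it. The code below uses only the standard library and an inline XML fragment. `is_header_row` and the sample XML are illustrative assumptions, not Mammoth's actual implementation.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}


def is_header_row(tr):
    """Return True if a w:tr element is flagged as a header row."""
    flag = tr.find("w:trPr/w:tblHeader", NS)
    if flag is None:
        return False
    # An on/off element without a w:val attribute counts as "on".
    val = flag.get("{%s}val" % W_NS, "true")
    return val not in ("0", "false", "off")


# Hypothetical fragment: the first row is flagged, the second is not.
SAMPLE = """
<w:tbl xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:tr>
    <w:trPr><w:tblHeader/></w:trPr>
    <w:tc><w:p><w:r><w:t>Variable</w:t></w:r></w:p></w:tc>
  </w:tr>
  <w:tr>
    <w:tc><w:p><w:r><w:t>SET STUFF</w:t></w:r></w:p></w:tc>
  </w:tr>
</w:tbl>
""".strip()

table = ET.fromstring(SAMPLE)
header_flags = [is_header_row(tr) for tr in table.findall("w:tr", NS)]
print(header_flags)  # [True, False]
```

Reading the flag per row, rather than assuming the first row is always a header, avoids marking tables that have no header row at all.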

Stikemanley commented 7 years ago

Hello,

This feature interests me as well. I'm currently coercing the first row of every table into a header row, which causes downstream problems for tables that shouldn't have a header row at all. The ability for Mammoth to identify header rows would be very useful.

Is there an update?

Thanks! Mike

mwilliamson commented 7 years ago

> Is there an update?

This is still unimplemented.

mwilliamson commented 7 years ago

Could you try the latest commit on master and see if that has the behaviour you expect?
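One way to check the new behaviour might look like the sketch below. It relies on `mammoth.convert_to_html`, which takes a file object opened in binary mode and returns a result with `value` and `messages`. The helper names (`convert_docx`, `count_header_cells`) are hypothetical, and treating the presence of `<th>` cells as the sign of a detected header row is an assumption about the output.

```python
from html.parser import HTMLParser


class HeaderCellCounter(HTMLParser):
    """Count <th> start tags in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.th_count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "th":
            self.th_count += 1


def count_header_cells(html):
    """Return how many <th> cells appear in the given HTML string."""
    counter = HeaderCellCounter()
    counter.feed(html)
    return counter.th_count


def convert_docx(path):
    """Convert a .docx file to HTML; returns (html, messages)."""
    import mammoth  # third-party; imported lazily so the helpers above work without it

    with open(path, "rb") as docx_file:
        result = mammoth.convert_to_html(docx_file)
    return result.value, result.messages
```

With Mammoth installed from master, something like `html, messages = convert_docx("table-headers.docx")` followed by `count_header_cells(html)` should give a non-zero count if the header row in the sample document is picked up.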

dividor commented 7 years ago

Great! Will give it a go and let you know.

Stikemanley commented 7 years ago

Hey,

Just tried it out and it works beautifully, thanks for your hard work!!!

Best, Mike

dividor commented 7 years ago

Yes, thanks loads!

mwilliamson commented 7 years ago

Glad it works.