pqzx / html2docx

Convert html to docx
MIT License

Color conversion not done when a text is in bold or italic #21

Open BreakerZero opened 2 years ago

BreakerZero commented 2 years ago

As the title says, I can't get the color value when text is bold or italic; it always returns None. What I tried:

test.py.txt

My original html rendering:

html

My docx rendering : myexemple

I suspect this happens because html2docx does not account for a style attribute placed on a <strong> or <i> tag. Is there a way to solve this? (I'd like to avoid nesting everything in a span tag every time, if possible.)

pqzx commented 2 years ago

Yeah, restricting styles to spans isn't really ideal. For now, a hacky solution is to add the following:

# `tag` is assumed to be the loop variable of that final for loop over self.tags
if 'style' in self.tags[tag]:
    style = self.parse_dict_string(self.tags[tag]['style'])
    self.add_styles_to_run(style)

under the final for loop in handle_data() https://github.com/pqzx/html2docx/blob/9337c3950bee62a4fea5b722e7ba19c163df4d9f/htmldocx/h2d.py#L550
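
To illustrate the idea behind this workaround (read the style attribute from whichever tag is open, not only from span), here is a minimal standalone sketch built on the standard library's html.parser. It is not html2docx's code: `StyleCollector`, `parse_style`, `open_tags`, and `chunks` are hypothetical names introduced for this example, and it only records the merged inline style per text chunk instead of writing to a docx run.

```python
from html.parser import HTMLParser


def parse_style(style):
    """Turn 'color: red; font-size: 12px' into {'color': 'red', 'font-size': '12px'}."""
    result = {}
    for declaration in style.split(";"):
        if ":" in declaration:
            name, value = declaration.split(":", 1)
            result[name.strip().lower()] = value.strip()
    return result


class StyleCollector(HTMLParser):
    """Hypothetical helper: records the merged inline style for each text chunk."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of (tag, attrs dict) for currently open elements
        self.chunks = []     # list of (text, merged style dict)

    def handle_starttag(self, tag, attrs):
        # Note: void elements such as <br> are never popped in this sketch.
        self.open_tags.append((tag, dict(attrs)))

    def handle_endtag(self, tag):
        # Remove the innermost open element with a matching name, if any.
        for index in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[index][0] == tag:
                del self.open_tags[index]
                break

    def handle_data(self, data):
        if not data.strip():
            return
        merged = {}
        # Walk from outermost to innermost so inner declarations win.
        for _tag, attrs in self.open_tags:
            if attrs.get("style"):
                merged.update(parse_style(attrs["style"]))
        self.chunks.append((data, merged))


collector = StyleCollector()
collector.feed(
    '<p><strong style="color: #ff0000">bold</strong> and '
    '<span style="color: blue"><i style="color: green">italic</i></span></p>'
)
collector.close()

for text, style in collector.chunks:
    print(repr(text), style)
```

In this sketch the color set on `<strong>` is picked up even though no span is involved, and the innermost declaration overrides the outer one when both a span and an `<i>` set a color.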