dfop02 / html4docx

Convert html to docx
MIT License

Does this not support inline styles of html (consider richtext editor output)? #7

Open berkio3x opened 1 month ago

berkio3x commented 1 month ago

My HTML also contains inline styling information, as in the example HTML below.

The generated docx did not have any of these styles applied.

Happy to contribute to this feature if help is needed.

code


from html4docx import HtmlToDocx

def hh(html):
    # Parse the HTML string and save the resulting document to disk.
    new_parser = HtmlToDocx()
    docx = new_parser.parse_html_string(html)
    docx.save("h4.docx")

example html

<p style=\"margin-left:40.5pt;\">Following courts were checked across India, Mauritius and Singapore&nbsp;</p>
<figure class=\"table\">
    <table style=\"border-collapse:collapse;margin-left:.5in;\" border=\"1\" cellspacing=\"0\" cellpadding=\"0\"
        width=\"641\">
        <tbody>
            <tr style=\"height:23.75pt;\">
                <td style=\"background-color:#3749EF;border-bottom-color:#BFBFBF;border-left-color:#3749EF;border-right-color:#3749EF;border-style:solid;border-top-color:#3749EF;border-width:1.0pt;height:23.75pt;padding:0in;width:258.35pt;\"
                    width=\"344\">
                    <p style=\"text-align:center;\"><span style=\"color:white;\"><strong>CATEGORY</strong></span></p>
                </td>
                <td style=\"background-color:#3749EF;border-bottom:1.0pt solid
                    #BFBFBF;border-left-style:none;border-right:1.0pt solid #3749EF;border-top:1.0pt solid
                    #3749EF;height:23.75pt;padding:0in;width:222.2pt;\" width=\"296\">
                    <p style=\"margin-left:.7pt;text-align:center;\"><span
                            style=\"color:white;\"><strong>OBSERVATIONS/COMMENTS</strong></span></p>
                </td>
            </tr>
            <tr style=\"height:15.5pt;\">
                <td style=\"background-color:#BFBFBF;border-bottom-style:solid;border-color:#BFBFBF;border-left-style:solid;border-right-style:solid;border-top-style:none;border-width:1.0pt;height:15.5pt;padding:0in;width:258.35pt;\"
                    width=\"344\"><strong>NETHERLANDS COURTS</strong></td>
                <td style=\"background-color:#BFBFBF;border-bottom:1.0pt solid
                    #BFBFBF;border-left-style:none;border-right:1.0pt solid
                    #BFBFBF;border-top-style:none;height:15.5pt;padding:0in;width:222.2pt;\" width=\"296\">&nbsp;</td>
            </tr>
        </tbody>
    </table>
</figure>
<p>&nbsp;</p>
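
To check which elements in an input like this actually carry inline styles, a small standard-library sketch can count `style` attributes per tag. The names `StyleCounter` and `count_styled_tags` are hypothetical and not part of html4docx, and the sketch assumes the HTML uses plain double quotes (the backslash-escaped `\"` quotes above would need unescaping first).

```python
from collections import Counter
from html.parser import HTMLParser


class StyleCounter(HTMLParser):
    """Count, per tag name, how many elements have a non-empty style attribute."""

    def __init__(self):
        super().__init__()
        self.styled = Counter()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        style = dict(attrs).get("style")
        if style:
            self.styled[tag] += 1


def count_styled_tags(html):
    """Return a {tag: count} dict of elements that carry inline styles."""
    counter = StyleCounter()
    counter.feed(html)
    counter.close()
    return dict(counter.styled)


# Shortened, unescaped stand-in for the table above (illustrative only).
sample = (
    '<table style="border-collapse:collapse;">'
    '<tr style="height:23.75pt;">'
    '<td style="width:258.35pt;">A</td><td>B</td>'
    "</tr></table>"
)
print(count_styled_tags(sample))
```

Comparing this count with what shows up in the generated docx gives a rough picture of which tags lose their styling.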

dfop02 commented 1 month ago

Hello @berkio3x!

I took a quick look at this, and it seems that styles on a table's rows and columns are not currently supported; take a look at the handle_table function. The style on the table itself is applied, but the styles on the rows and columns are not.

Feel free to take a look, open a PR, and try to add support for this feature. I will add it to my to-do list and work on it later.
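
As a starting point for row and cell support, here is a minimal sketch of turning a `style` attribute into values that table-handling code could consume. It is not html4docx's implementation: `parse_inline_style` and `length_to_points` are hypothetical helper names, only the standard library is used, and splitting on `;` is a simplification that would break on values containing semicolons (for example `url(data:...)`).

```python
import re


def parse_inline_style(style):
    """Split a CSS style attribute into a {property: value} dict."""
    declarations = {}
    for chunk in style.split(";"):
        if ":" not in chunk:
            continue  # skip empty or malformed pieces
        name, _, value = chunk.partition(":")
        # Collapse internal whitespace, since values may wrap across lines.
        declarations[name.strip().lower()] = " ".join(value.split())
    return declarations


# CSS absolute units expressed in points (1in = 72pt = 96px).
_POINTS_PER_UNIT = {"pt": 1.0, "in": 72.0, "px": 0.75,
                    "cm": 72.0 / 2.54, "mm": 72.0 / 25.4}


def length_to_points(value):
    """Convert a CSS length such as '23.75pt' or '.5in' to points, else None."""
    match = re.fullmatch(r"([0-9]*\.?[0-9]+)(pt|in|px|cm|mm)",
                         value.strip().lower())
    if match is None:
        return None
    number, unit = match.groups()
    return float(number) * _POINTS_PER_UNIT[unit]


# Style string shaped like one of the <td> cells in the report.
cell_style = ("background-color:#BFBFBF;border-bottom:1.0pt solid\n"
              "    #BFBFBF;border-left-style:none;height:15.5pt;"
              "padding:0in;width:222.2pt;")
props = parse_inline_style(cell_style)
print(props["background-color"], length_to_points(props["height"]))
```

The resulting dict could then be mapped onto whatever cell formatting the docx side supports, such as shading, width, and borders; that mapping is left out here because it depends on how handle_table builds its cells.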