mwilliamson / python-mammoth

Convert Word documents (.docx files) to HTML
BSD 2-Clause "Simplified" License
811 stars, 121 forks

Support for ruby text #76

Open elenderg opened 5 years ago

elenderg commented 5 years ago

Would you be able to make mammoth recognize ruby text?

Warnings/errors from Mammoth conversion:
Message(type='warning', message='An unrecognised element was ignored: w:ruby')

mwilliamson commented 5 years ago

Could you provide a minimal example document, along with the HTML that's currently produced and the expected HTML?

elenderg commented 5 years ago

ruby.docx

The desired output would look like this:

<html>

<head>
<meta http-equiv=Content-Type content="text/html; charset=windows-1252">

</head>

<body>

<div> 

<p>
    <ruby style='ruby-align:distribute-space'>
        日本語
            <rt>
                にほんご
            </rt>
    </ruby>
(1-2-1 Alignment (distribute space); Yu Mincho, 24pt)
</p>

<p>
    <ruby>
        日
            <rt>
                に
            </rt>
    </ruby>
    <ruby>
        本
            <rt>
                ほん
            </rt>
    </ruby>
    <ruby>
        語
            <rt>
                ご
            </rt>
    </ruby>
 (mono/individual Alignment)
</p>

<p>
    <ruby style='ruby-align:left'>
        日本語
            <rt>
                にほんご
            </rt>
    </ruby>
(Left Alignment)
</p>

<p>
    <ruby style='ruby-align:center'>
        日本語
            <rt>
                にほんご
            </rt>
    </ruby>
(Center Alignment)
</p>

<p>
    <ruby style='ruby-align:right'>
        日本語
            <rt>
                にほんご
            </rt>
    </ruby>
(Right Alignment)
</p>

</div>

</body>

</html>
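
To make the requested mapping concrete, here is a minimal sketch of how a `w:ruby` element might be turned into the `<ruby>`/`<rt>` markup above. It is not part of mammoth and does not use mammoth's API: it only uses the Python standard library. The helper names (`ruby_to_html`, `_text_of`) and the `SAMPLE` fragment are hypothetical, written for illustration. It assumes the usual WordprocessingML layout, where `w:ruby` holds a `w:rt` (the annotation) and a `w:rubyBase` (the base text), and it ignores the alignment settings in `w:rubyPr`.

```python
import xml.etree.ElementTree as ET
from html import escape

# WordprocessingML main namespace, bound to the conventional "w" prefix.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}


def _text_of(element):
    """Concatenate the text of every w:t descendant (hypothetical helper)."""
    if element is None:
        return ""
    return "".join(t.text or "" for t in element.iterfind(".//w:t", NS))


def ruby_to_html(ruby_element):
    """Render one w:ruby element as an HTML <ruby> element.

    Assumption: alignment (w:rubyPr/w:rubyAlign) is ignored in this sketch.
    """
    base = _text_of(ruby_element.find("w:rubyBase", NS))
    annotation = _text_of(ruby_element.find("w:rt", NS))
    return "<ruby>{}<rt>{}</rt></ruby>".format(escape(base), escape(annotation))


# Hand-written fragment for illustration, not extracted from ruby.docx.
SAMPLE = (
    '<w:ruby xmlns:w="' + W_NS + '">'
    '<w:rubyPr><w:rubyAlign w:val="distributeSpace"/></w:rubyPr>'
    "<w:rt><w:r><w:t>にほんご</w:t></w:r></w:rt>"
    "<w:rubyBase><w:r><w:t>日本語</w:t></w:r></w:rubyBase>"
    "</w:ruby>"
)

print(ruby_to_html(ET.fromstring(SAMPLE)))
```

Mapping the `w:rubyAlign` value onto a `ruby-align` style, as in the desired output, would be a separate step on top of this.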