onizet / html2openxml

Html2OpenXml is a small .NET library that converts simple or advanced HTML to plain OpenXml components. The project started in 2009, initially to convert user comments from SharePoint to Word.
MIT License

Styles distorted while parsing #86

Closed by KhurramShehzadd 2 weeks ago

KhurramShehzadd commented 3 years ago

I'm using a TinyMCE editor to get the HTML of the content, and I pass that HTML to my API to generate a Word document. The HTML from TinyMCE renders correctly in any online HTML editor, but when I run it through the parser with converter.ParseHtml(html); the generated Word document is missing the styles.

Below is my actual HTML:

<p dir="rtl" style="margin: 0in 0in 8pt; font-size: 11pt; font-family: Calibri, sans-serif;"><span style="font-size: 16.0pt; font-family: 'Sakkal Majalla';">تجربه إضافة نص تجربه إضافة نص تجربه إضافة نص تجربه إضافة نص تجربه إضافة نص تجربه إضافة نص تجربه إضافة نص تجربه إضافة نص تجربه إضافة نص تجربه إضافة نص تجربه إضافة نص تجربه إضافة نص تجربه إضافة نص تجربه إضافة نص تجربه إضافة نص تجربه إضافة نص تجربه إضافة نص تجربه إضافة نص تجربه إضافة نص تجربه إضافة نص تجربه إضافة نص تجربه إضافة نص تجربه إضافة نص تجربه إضافة نص.</span></p>
<ol dir="rtl" style="margin-bottom: 0in;">
<li><span style="font-size: 16.0pt; font-family: 'Sakkal Majalla';">تجربه </span></li>
<li><span style="font-size: 16.0pt; font-family: 'Sakkal Majalla';">تجربه </span></li>
<li><span style="font-size: 16.0pt; font-family: 'Sakkal Majalla';">تجربه</span></li>
</ol>
<div dir="rtl" align="right">
<table class="MsoTable15Grid4Accent3" dir="rtl" style="border-collapse: collapse; border: none;" border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr style="height: 30px;">
<td style="width: 155.8pt; border-top: 1pt solid #a5a5a5; border-right: 1pt solid #a5a5a5; border-bottom: 1pt solid #a5a5a5; border-image: initial; border-left: none; background: #a5a5a5; padding: 0in 5.4pt; height: 30px;" valign="top" width="208">
<p dir="RTL" style="margin: 0in 0in 8pt; font-size: 11pt; font-family: Calibri, sans-serif;"><strong><span style="font-size: 16.0pt; font-family: 'Sakkal Majalla'; color: white;">تجربه</span></strong></p>
</td>
<td style="width: 155.85pt; border-top: 1pt solid #a5a5a5; border-left: none; border-bottom: 1pt solid #a5a5a5; border-right: none; background: #a5a5a5; padding: 0in 5.4pt; height: 30px;" valign="top" width="208">
<p dir="RTL" style="margin: 0in 0in 8pt; font-size: 11pt; font-family: Calibri, sans-serif;"><strong><span style="font-size: 16.0pt; font-family: 'Sakkal Majalla'; color: white;">تجربه</span></strong></p>
</td>
<td style="width: 155.85pt; border-top: 1pt solid #a5a5a5; border-bottom: 1pt solid #a5a5a5; border-left: 1pt solid #a5a5a5; border-image: initial; border-right: none; background: #a5a5a5; padding: 0in 5.4pt; height: 30px;" valign="top" width="208">
<p dir="RTL" style="margin: 0in 0in 8pt; font-size: 11pt; font-family: Calibri, sans-serif;"><strong><span style="font-size: 16.0pt; font-family: 'Sakkal Majalla'; color: white;">تجربه</span></strong></p>
</td>
</tr>
<tr style="height: 30px;">
<td style="width: 155.8pt; border-right: 1pt solid #c9c9c9; border-bottom: 1pt solid #c9c9c9; border-left: 1pt solid #c9c9c9; border-image: initial; border-top: none; background: #ededed; padding: 0in 5.4pt; height: 30px;" valign="top" width="208">
<p dir="RTL" style="margin: 0in 0in 8pt; font-size: 11pt; font-family: Calibri, sans-serif;"><strong><span style="font-size: 16.0pt; font-family: 'Sakkal Majalla'; color: black;">1</span></strong></p>
</td>
<td style="width: 155.85pt; border-top: none; border-left: 1pt solid #c9c9c9; border-bottom: 1pt solid #c9c9c9; border-right: none; background: #ededed; padding: 0in 5.4pt; height: 30px;" valign="top" width="208">
<p dir="RTL" style="margin: 0in 0in 8pt; font-size: 11pt; font-family: Calibri, sans-serif;"><span style="font-size: 16.0pt; font-family: 'Sakkal Majalla'; color: black;">2</span></p>
</td>
<td style="width: 155.85pt; border-top: none; border-left: 1pt solid #c9c9c9; border-bottom: 1pt solid #c9c9c9; border-right: none; background: #ededed; padding: 0in 5.4pt; height: 30px;" valign="top" width="208">
<p dir="RTL" style="margin: 0in 0in 8pt; font-size: 11pt; font-family: Calibri, sans-serif;"><span style="font-size: 16.0pt; font-family: 'Sakkal Majalla'; color: black;">3</span></p>
</td>
</tr>
</tbody>
</table>
</div>
<p dir="rtl" style="margin: 0in 0in 8pt; font-size: 11pt; font-family: Calibri, sans-serif;"><span style="font-size: 16.0pt; font-family: 'Sakkal Majalla';">&nbsp;</span></p>

Expected result in Word:

image

Actual result in Word:

image

KhurramShehzadd commented 3 years ago

I noticed that all direction attributes are converted from dir="rtl" to dir="ltr" after parsing. Any help would be highly appreciated.

KhurramShehzadd commented 3 years ago

@onizet, can you please advise on this? What could be a workaround?

onizet commented 3 years ago

Hi, after a long debugging session, I found out that the library doesn't yet support the background style. Instead, I suggest you use either bgcolor or background-color.
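
For HTML that comes straight out of an editor, one way to apply this suggestion is to rewrite the `background` shorthand to `background-color` before the string reaches the converter. The sketch below is only an illustration: `BackgroundStyleFixer` and `Rewrite` are hypothetical names, not part of the library, and it assumes the shorthand holds nothing but a plain color, as in the table cells posted above (`background: #a5a5a5;`).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Html2OpenXml.
public static class BackgroundStyleFixer
{
    // Matches a double-quoted inline style attribute and captures its CSS text.
    private static readonly Regex StyleAttribute =
        new Regex("(?<![-\\w])style\\s*=\\s*\"(?<css>[^\"]*)\"", RegexOptions.IgnoreCase);

    // Matches the `background` shorthand name only; `background-color:` is left alone
    // because the colon must directly follow the word (optional whitespace aside).
    private static readonly Regex BackgroundShorthand =
        new Regex(@"(?<![-\w])background\s*:", RegexOptions.IgnoreCase);

    // Assumption: the shorthand value is a single color. A value such as
    // `url(...) no-repeat` would not be valid once renamed to `background-color`.
    public static string Rewrite(string html)
    {
        return StyleAttribute.Replace(html, match =>
        {
            string css = BackgroundShorthand.Replace(
                match.Groups["css"].Value, "background-color:");
            return "style=\"" + css + "\"";
        });
    }
}
```

The cleaned string would then be passed to the same call used in the original report, for example `converter.ParseHtml(BackgroundStyleFixer.Rewrite(html));`. Restricting the replacement to `style` attributes keeps visible text that happens to contain the word "background:" untouched.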

onizet commented 1 month ago

@KhurramShehzadd hi, I think I cross-replied, sorry about that. Can you provide me with a docx file with the expected output? It's hard for me to troubleshoot RTL documents as I don't know how to switch my Word to RTL mode. Thank you.