onizet / html2openxml

Html2OpenXml is a small .NET library that converts simple or advanced HTML to plain OpenXml components. The project started in 2009, initially to convert users' comments from SharePoint to Word.
MIT License

Cannot insert the OpenXmlElement "newChild" because it is part of a tree. When using a template #69

Closed ElementalLogic closed 3 years ago

ElementalLogic commented 4 years ago

I seem to be running into an issue trying to put my HTML into a template.

I get this error:

Cannot insert the OpenXmlElement "newChild" because it is part of a tree.

Strangely, simple HTML works fine, but complex HTML gives the above error. Similarly, if I use a new document rather than a template, it all works. Tracking this down is proving tricky.

Code below:

```
using (WordprocessingDocument package = WordprocessingDocument.CreateFromTemplate(HttpContext.Current.Server.MapPath("/content/template.docx")))
{
    // A template should already contain a main part; create one only if it is missing.
    MainDocumentPart mainPart = package.MainDocumentPart;
    if (mainPart == null)
    {
        mainPart = package.AddMainDocumentPart();
        new Document(new Body()).Save(mainPart);
    }

    // Convert the HTML and append the result to the document body.
    HtmlConverter converter = new HtmlConverter(mainPart);
    converter.ParseHtml(html);
    mainPart.Document.Save();
}
```

On a different note, is it possible to insert HTML chunks rather than a complete document?

The reason I'm asking is that I'm trying to build documents from HTML fragments stored in a database, and I want to apply different predefined Word layouts depending on the type of fragment. In the HTML version I do this with CSS, but using the fragments without the CSS styling results in less-than-optimal Word output. For example, we have lots of forms where we want headers/footers and section numbers right-aligned. My thinking is that I could create a Word template for that page and output my HTML into certain areas of it (using some sort of find/replace). I could then combine several documents into one to get my final output.
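
The find/replace idea above could be sketched roughly as follows. This is only an illustration under assumptions, not a confirmed recipe from the library author: it assumes Html2OpenXml 2.x, where `HtmlConverter.Parse` returns the converted elements without appending them to the body, and that the template contains a paragraph whose whole text is a marker such as `{{objectives}}`. The `FragmentInserter` class, the `InsertAtMarker` method and the marker text are hypothetical names introduced here.

```csharp
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
using HtmlToOpenXml;

// Hypothetical helper, not part of Html2OpenXml.
public static class FragmentInserter
{
    // Replaces the first paragraph whose text equals `marker` with the
    // elements converted from `html`. Returns false if no marker is found.
    public static bool InsertAtMarker(MainDocumentPart mainPart, string marker, string html)
    {
        Body body = mainPart.Document?.Body;
        if (body == null) return false;

        Paragraph placeholder = body.Descendants<Paragraph>()
            .FirstOrDefault(p => p.InnerText.Trim() == marker);
        if (placeholder == null) return false;

        // Assumption: Parse (2.x API) hands back unattached elements.
        var converter = new HtmlConverter(mainPart);

        // Insert each converted element after the previous one to keep order.
        OpenXmlElement anchor = placeholder;
        foreach (OpenXmlElement element in converter.Parse(html))
        {
            anchor = anchor.InsertAfterSelf(element);
        }

        // Drop the marker paragraph now that the content is in place.
        placeholder.Remove();
        return true;
    }
}
```

Calling `FragmentInserter.InsertAtMarker(mainPart, "{{objectives}}", fragmentHtml)` once per fragment and then `mainPart.Document.Save()` would fill a template area by area. Matching a whole paragraph avoids the problem of marker text being split across several runs, which a run-level find/replace would have to handle.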

ElementalLogic commented 4 years ago

Further details:

I've spent some time profiling, and found that I can still reproduce the error with simplified HTML.

The HTML below causes a problem when "Objective" is defined as a paragraph style in the template. If I change the `div` to a `span` it works (but then puts the elements in the `<p>` tag into the objectives section). If I change the class name to an invalid style it works (but obviously the section isn't styled correctly).

    html = @"<h1>Manual 3</h1>
    <div class=""Objective"">Objective: To identify conditions and relationships that could give rise to threats to integrity or objectivity including any that could impair independence and to assess whether the firm can accept (re)appointment.</div>
    <p>The items marked with an asterisk indicate that there is a potential benefit arising from Provisions Available for Audits of Small Entities (PAASE) &ndash; see the Audit Procedures Manual for further guidance.</p>
    <p>Ensure that for new clients, any relevant matters arising from completion of the New Client Checklist are included here (and ensure that the checklist is filed on the permanent file).</p>";
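
For context on the message itself: the Open XML SDK raises it whenever an element that already has a parent is appended to another parent. The sketch below only demonstrates that general rule and the usual workaround (`CloneNode(true)`). It does not pinpoint where Html2OpenXml hits the condition with the "Objective" paragraph style, and `TreeOwnershipDemo` is a hypothetical name used purely for illustration.

```csharp
using System;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical demo class illustrating the SDK's single-parent rule.
public static class TreeOwnershipDemo
{
    // Returns true if appending an already-parented element throws.
    public static bool ReusingElementThrows()
    {
        var first = new Body();
        var second = new Body();
        var paragraph = new Paragraph(new Run(new Text("Objective")));

        first.AppendChild(paragraph);   // paragraph now belongs to `first`
        try
        {
            second.AppendChild(paragraph);  // same instance, second parent
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // Appends a deep copy instead and returns the child count of the
    // second body.
    public static int AppendCopy()
    {
        var first = new Body();
        var second = new Body();
        var paragraph = new Paragraph(new Run(new Text("Objective")));

        first.AppendChild(paragraph);
        second.AppendChild(paragraph.CloneNode(true));  // copy has no parent yet
        return second.ChildElements.Count;
    }
}
```

If an element taken from the template (for example, one carrying style-related properties) ends up being reused rather than cloned, this is the exception that would surface, which may be a useful place to start looking.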