EvotecIT / OfficeIMO

Fast and easy to use cross-platform .NET library that creates or modifies Microsoft Word (DocX) and later also Excel (XLSX) files without installing any software. Library is based on Open XML SDK
MIT License

[Question] Does OfficeIMO provide support for HTML fragments? #228

Closed: derek-price closed this issue 3 months ago

derek-price commented 3 months ago

I have a list of HTML fragments (from an external source query) that I would like to import into a Word document preserving the formatting. An example is:

var htmlContent = @"<div><b>Imports. </b>An issue was resolved where user import files took 6+ hours to complete. <br> </div><div> </div>";

After many, many hours I was able to import that single statement into Word using the DocumentFormat.OpenXml NuGet package and its AddAlternativeFormatImportPart() method. It was a miserable experience. Trying to do anything else with that package was the worst.

I found OfficeIMO and it's been an absolute dream to use, but I really need to get those fragments imported. I've been scanning the examples but don't see anything obvious.

Does OfficeIMO support this functionality?
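
For readers who have not seen the Open XML SDK route mentioned above, here is a minimal sketch of it, not the code from the original post. It assumes the DocumentFormat.OpenXml NuGet package is referenced. The class name AltChunkSketch, the method WriteHtmlFragment, and the relationship id "htmlChunk1" are hypothetical names chosen for illustration. Wrapping the fragment in html/body tags is also an assumption, made so that Word receives a complete HTML document.

```csharp
using System.IO;
using System.Text;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class AltChunkSketch {
    // Hypothetical helper: creates a new .docx whose body is a single HTML altChunk.
    public static void WriteHtmlFragment(string docxPath, string htmlFragment) {
        // Assumption: wrap the bare fragment so the import part holds a full HTML document.
        string html = "<html><head><meta charset=\"utf-8\"></head><body>"
                      + htmlFragment + "</body></html>";

        using (WordprocessingDocument doc =
               WordprocessingDocument.Create(docxPath, WordprocessingDocumentType.Document)) {
            MainDocumentPart mainPart = doc.AddMainDocumentPart();
            mainPart.Document = new Document(new Body());

            // Store the HTML inside the package as an alternative format import part.
            const string chunkId = "htmlChunk1"; // illustrative relationship id
            AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(
                AlternativeFormatImportPartType.Html, chunkId);
            using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(html))) {
                chunk.FeedData(stream);
            }

            // Reference the stored HTML from the document body.
            mainPart.Document.Body.AppendChild(new AltChunk { Id = chunkId });
            mainPart.Document.Save();
        }
    }
}
```

The HTML is kept as-is inside the package and is converted by Word when the document is opened, so other viewers may not render it.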

PrzemyslawKlys commented 3 months ago

Only as an HTML file for now. I guess it would be possible to add a way to convert HTML directly, without it being in a file, but so far that is not available.

public static void Example_EmbedFileHTML(string folderPath, string templateFolder, bool openWord) {
    Console.WriteLine("[*] Creating standard document with embedded HTML file");
    string filePath = System.IO.Path.Combine(folderPath, "EmbeddedFileHTML.docx");
    string htmlFilePath = System.IO.Path.Combine(templateFolder, "SampleFileHTML.html");
    using (WordDocument document = WordDocument.Create(filePath)) {
        // Report the embedded document counts before anything is added
        Console.WriteLine("Embedded documents in word: " + document.EmbeddedDocuments.Count);
        Console.WriteLine("Embedded documents in Section 0: " + document.Sections[0].EmbeddedDocuments.Count);

        document.AddParagraph("Add HTML document in DOCX");

        document.AddSection();

        Console.WriteLine("Embedded documents in Section 1: " + document.Sections[1].EmbeddedDocuments.Count);

        // Embed the HTML file into the document
        document.AddEmbeddedDocument(htmlFilePath);

        // Write the embedded content back out to disk (hard-coded Windows path; adjust for your environment)
        document.EmbeddedDocuments[0].Save("C:\\TEMP\\EmbeddedFileHTML.html");

        // Report the counts again, per document and per section
        Console.WriteLine("Embedded documents in word: " + document.EmbeddedDocuments.Count);
        Console.WriteLine("Embedded documents in Section 0: " + document.Sections[0].EmbeddedDocuments.Count);
        Console.WriteLine("Embedded documents in Section 1: " + document.Sections[1].EmbeddedDocuments.Count);
        Console.WriteLine("Content type: " + document.EmbeddedDocuments[0].ContentType);

        document.Save(openWord);
    }
}

One would need to add a method that skips the first few lines of the code linked below and goes straight to the content.

https://github.com/EvotecIT/OfficeIMO/blob/91a87ac9c23d5b1ee4289e07ee9acab33e1396d4/OfficeIMO.Word/WordEmbeddedDocument.cs#L71-L110

derek-price commented 3 months ago

This is amazing - thanks for the fast reply! Let me give it a go and see how dangerous I can get. 😄

derek-price commented 3 months ago

Just wanted to let you know that your solution worked perfectly. Thanks again!

PrzemyslawKlys commented 3 months ago

I think I will add the ability to add fragments directly. I guess you wrote the fragment to a file and then added that file, which was then embedded as a fragment. We should have a direct way to do this as well.
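
The file-based workaround guessed at above might look roughly like the sketch below. It is not the poster's actual code. It uses only the OfficeIMO calls shown in the earlier example (WordDocument.Create, AddEmbeddedDocument, Save). The class HtmlFragmentWorkaround and its methods WrapFragment and CreateDocumentFromFragments are hypothetical names. Wrapping each fragment in a minimal HTML document and keeping the .html extension on the temporary files are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using OfficeIMO.Word;

public static class HtmlFragmentWorkaround {
    // Hypothetical helper: wraps a bare fragment in a minimal HTML document.
    public static string WrapFragment(string fragment) {
        return "<!DOCTYPE html><html><head><meta charset=\"utf-8\"></head><body>"
               + fragment + "</body></html>";
    }

    // Hypothetical helper: writes each fragment to a temporary .html file
    // and embeds that file, mirroring the file-based example above.
    public static void CreateDocumentFromFragments(string docxPath, IEnumerable<string> fragments) {
        var tempFiles = new List<string>();
        try {
            using (WordDocument document = WordDocument.Create(docxPath)) {
                foreach (string fragment in fragments) {
                    string tempPath = Path.Combine(
                        Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".html");
                    File.WriteAllText(tempPath, WrapFragment(fragment), Encoding.UTF8);
                    tempFiles.Add(tempPath);

                    document.AddEmbeddedDocument(tempPath);
                }
                document.Save(false); // false: do not open Word afterwards
            }
        } finally {
            // Remove the temporary files only after the document has been saved and closed.
            foreach (string tempFile in tempFiles) {
                File.Delete(tempFile);
            }
        }
    }
}
```

Deleting the temporary files only after the document is saved avoids relying on when the library reads them. A direct fragment API, as proposed above, would make the temporary files unnecessary.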