xceedsoftware / DocX

Fast and easy to use .NET library that creates or modifies Microsoft Word files without installing Word.

Changing the document leads to the document preview breaking in specialized software. #482

Open SheinEA opened 3 months ago

SheinEA commented 3 months ago

I noticed that after modifying a document with the library (replacing text, adding paragraphs, etc.), the document preview in specialized software stops working. If you open the document in MS Office, everything is fine. What could be the reason for this behavior?

SheinEA commented 2 months ago

Hi, I did some analysis. The library does not write minified XML the way the original file does. This increases the file size, and in some software the preview breaks. It would be good if, after the library saves the file, its XML were restored to the original minified form.

XceedBoucherS commented 2 months ago

Hi, could you explain the minifying process in more detail? When we load a document, we read its XML files, and when we save, we write to those XML files, but there shouldn't be anything increasing the XML file size other than new text.

Thank you

SheinEA commented 1 month ago

Hi!

  1. Create a new .docx document with the content - Hello!
  2. Make a copy of it for later comparison.
  3. Save one of these files using your library.
  4. Unzip both documents and find the files that have changed, for example document.xml.
  5. Compare the original version with the modified one; you will see something like this. Left is the original, right is after the changes: [image]

To work around this, we had to add some overhead: [image]

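
The comparison in steps 4 and 5 can also be scripted instead of unzipping by hand. The following is only a sketch: the class name `DocxPartComparer` and the two file paths passed on the command line are hypothetical, and it assumes the main body lives in the `word/document.xml` part, which is the usual location in a .docx package.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class DocxPartComparer
{
    // Reads one part (for example "word/document.xml") from a .docx package as text.
    public static string ReadPart(string docxPath, string partName)
    {
        using ZipArchive archive = ZipFile.OpenRead(docxPath);
        ZipArchiveEntry entry = archive.GetEntry(partName)
            ?? throw new FileNotFoundException($"Part '{partName}' not found in '{docxPath}'.");
        using var reader = new StreamReader(entry.Open());
        return reader.ReadToEnd();
    }

    // Counts line feeds; a minified part typically has very few of them.
    public static int CountLineBreaks(string text) => text.Split('\n').Length - 1;

    public static void Main(string[] args)
    {
        // args[0]: untouched copy, args[1]: copy saved through the library (hypothetical paths).
        string original = ReadPart(args[0], "word/document.xml");
        string modified = ReadPart(args[1], "word/document.xml");

        Console.WriteLine($"original: {original.Length} chars, {CountLineBreaks(original)} line breaks");
        Console.WriteLine($"modified: {modified.Length} chars, {CountLineBreaks(modified)} line breaks");
    }
}
```

If the two parts differ only in whitespace between tags, the modified copy should report more characters and more line breaks than the original.
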
XceedBoucherS commented 1 month ago

Hi,

From what I understand, the XML (in document.xml, for example) does not look the same before and after using our library. The difference is the indentation added by our library, but the content is the same, and opening the document in MS Office works fine. The problem only occurs when using third-party software. You have found a workaround that removes the indentation by setting an Office2013 format.

When we save the document, we use the "public void Save( TextWriter textWriter, SaveOptions options )" method from XDocument, with the options set to "SaveOptions.None", which means: "Format (indent) the XML while serializing."
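
To illustrate the effect of that option, here is a minimal sketch using plain `System.Xml.Linq`, independent of DocX. The XML string is a hypothetical, heavily simplified stand-in for a minified document.xml (no namespaces), and `SaveOptionsDemo` is an illustrative name. It serializes the same `XDocument` once with `SaveOptions.None` and once with `SaveOptions.DisableFormatting`, the standard option that suppresses added indentation. Whether DocX exposes a way to choose the option is not stated in this thread.

```csharp
using System;
using System.IO;
using System.Xml.Linq;

public static class SaveOptionsDemo
{
    // Serializes the document through XDocument.Save(TextWriter, SaveOptions).
    public static string Serialize(XDocument doc, SaveOptions options)
    {
        using var writer = new StringWriter();
        doc.Save(writer, options);
        return writer.ToString();
    }

    public static void Main()
    {
        // Hypothetical stand-in for a minified document.xml body.
        const string minified =
            "<document><body><p><r><t>Hello!</t></r></p></body></document>";
        XDocument doc = XDocument.Parse(minified);

        // SaveOptions.None indents nested elements, adding line breaks and spaces.
        string indented = Serialize(doc, SaveOptions.None);

        // SaveOptions.DisableFormatting writes the tree without added indentation.
        string compact = Serialize(doc, SaveOptions.DisableFormatting);

        Console.WriteLine(indented);
        Console.WriteLine(compact);
        Console.WriteLine($"indented: {indented.Length} chars, compact: {compact.Length} chars");
    }
}
```

Both outputs describe the same element tree; only the whitespace between tags differs, which matches the size increase reported earlier in the thread.
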

This has never been an issue for other customers, the indented XML is easier to read, and the problem only affects third-party software (which may need an old file format?). Since you have a workaround, we will take note of this issue and evaluate whether a change would be beneficial for the community, but we won't change anything at this point.

Thank you