metanorma / html2doc

Ruby gem that converts an HTML page/document into a Microsoft Word `.doc` file

Word output in MHT not working (URGENT) #57

Closed ronaldtse closed 3 years ago

ronaldtse commented 3 years ago

I was using Word 16.48 which had this problem:

[Screenshot 2021-03-29 at 5 08 30 PM]

But I downgraded Word to 16.47 and the problem persists.

So I went searching and experimenting.

This Stack Overflow post indicates that parts should be referenced by "Content-ID": https://stackoverflow.com/questions/9321456/images-in-mht-files-from-ms-word-do-not-display-in-email

And this is an excellent MHTML test archive: https://people.dsv.su.se/~jpalme/mimetest/MHTML-test-messages.html

With some experimentation I found that this combination works:

  mso-footnote-separator: url("cid:header.html") fs;
  mso-footnote-continuation-separator: url("cid:header.html") fcs;

...

------=_NextPart_f8ff7cff.51ce.48fd
Content-ID: <header.html>
Content-Disposition: inline; filename="header.html"
Content-Transfer-Encoding: base64
Content-Type: text/html charset="utf-8"

Finally it's fixed! So the solution is to reference each part by its "Content-ID" (a `cid:` URL) instead of a `file:///` URL.
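
For illustration only, here is a minimal Ruby sketch of what the change could look like. It is not html2doc's actual code: `cid_part` and `file_refs_to_cid` are hypothetical helper names, and the content type is simply passed in by the caller. The first helper emits a MIME part whose headers match the working combination above; the second rewrites `file:///` references to that part into `cid:` references.

```ruby
require "base64"

# Hypothetical helper (not part of html2doc's API): builds one MHTML part
# that other parts can address as cid:<filename>.
def cid_part(boundary, filename, body, content_type)
  # Base64 body wrapped at 76 characters per line, as MIME expects.
  encoded = Base64.strict_encode64(body).scan(/.{1,76}/).join("\r\n")
  [
    "--#{boundary}",
    "Content-ID: <#{filename}>",
    "Content-Disposition: inline; filename=\"#{filename}\"",
    "Content-Transfer-Encoding: base64",
    "Content-Type: #{content_type}",
    "",
    encoded,
    ""
  ].join("\r\n")
end

# Hypothetical helper: replaces any file:/// URL ending in the given
# filename with a cid: URL pointing at the part built above.
def file_refs_to_cid(text, filename)
  text.gsub(%r{file:///[^"')\s]*#{Regexp.escape(filename)}}, "cid:#{filename}")
end
```

With these, a stylesheet line such as `mso-footnote-separator: url("file:///C:/tmp/header.html") fs;` (the path is made up for the example) would become `mso-footnote-separator: url("cid:header.html") fs;`. Whether images can be referenced the same way is not confirmed here.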

@opoudjis could you help implement this fix? Thanks!

ronaldtse commented 3 years ago

From experimentation I am not sure if image attachments continue to work in this pattern. Will check again.