plutext / docx4j

JAXB-based Java library for Word docx, Powerpoint pptx, and Excel xlsx files
https://www.docx4java.org/

Allow docx4j to work together with Content Security Policy headers #545

Open torbenriis opened 1 year ago

torbenriis commented 1 year ago

Hi

While introducing a Content Security Policy (CSP), we found that our use of docx conversion emits inline script tags when converting to HTML. A main goal of CSP is to block inline scripts in order to prevent XSS attacks. We have studied the code, and it seems there is no way to prevent the “toggleDiv” function from being generated.

The HTMLExporterVisitorDelegate always calls HtmlScriptHelper.createDefaultScript(…)

As we see it, there are three options:

  1. Introduce an option to specify a nonce to be used when generating the script tag.
  2. Provide a constant in the project containing a precomputed hash of the function. We could then refer to the constant when setting the header, without worrying about whether the function has changed.
  3. Provide an option to suppress generation of this script, so that we can serve it ourselves from an include.
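
For reference, option 2 could be approximated on the application side today. The following is a minimal sketch using only the JDK; the class and method names (`CspScriptHash`, `sha256Source`) are hypothetical and not part of docx4j. It computes the `'sha256-...'` source expression that a `script-src` directive expects for an inline script.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper, not part of docx4j.
class CspScriptHash {

    /**
     * Returns a CSP source expression of the form 'sha256-<base64 digest>'
     * for the given inline script text.
     */
    static String sha256Source(String scriptText) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(scriptText.getBytes(StandardCharsets.UTF_8));
            return "'sha256-" + Base64.getEncoder().encodeToString(hash) + "'";
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a mandatory algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

The result would be appended to the header value, for example `"script-src " + CspScriptHash.sha256Source(scriptText)`. The browser hashes the exact text between the opening and closing script tags, so the input must match the generated output byte for byte, including any comment wrapper the converter places around the script.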

How do you see this? Or are there existing options to influence the script generation that I have overlooked?

The inline stylesheet generation is also an issue, although it seems a lot of people still choose to allow 'unsafe-inline' for style-src. Ideally, this should be possible to control as well.

https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Security-Policy
https://content-security-policy.com/nonce
https://content-security-policy.com/hash

plutext commented 1 year ago

HTMLSettings allows you to set both SCRIPT_ELEMENT_HANDLER and STYLE_ELEMENT_HANDLER.

The interface for the script element handler is at https://github.com/plutext/docx4j/blob/VERSION_8_3_9/docx4j-core/src/main/java/org/docx4j/convert/out/ConversionHTMLScriptElementHandler.java

The default implementation is at https://github.com/plutext/docx4j/blob/VERSION_8_3_9/docx4j-core/src/main/java/org/docx4j/convert/out/html/HTMLConversionContext.java#L75

Following that example, you could create your own implementation which simply drops it.

Add it to your htmlSettings with htmlSettings.setScriptElementHandler(yourScriptElementHandler).
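
As a rough illustration of what such a handler's body might do, here is a sketch that uses only the JDK's DOM API. The class `NonceScriptElements` and its methods are hypothetical. It assumes the handler interface receives a `org.w3c.dom.Document` and the script text and returns an `Element`, and that returning `null` means no script is emitted; both points should be checked against the interface and default implementation linked above.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, not part of docx4j; a custom handler could delegate to it.
class NonceScriptElements {

    /**
     * Builds a script element carrying a CSP nonce, or returns null
     * when there is no script text to emit.
     */
    static Element createScriptElement(Document document, String scriptDefinition, String nonce) {
        if (scriptDefinition == null || scriptDefinition.isEmpty()) {
            return null; // assumption: null means "emit no script element"
        }
        Element script = document.createElement("script");
        script.setAttribute("type", "text/javascript");
        script.setAttribute("nonce", nonce); // must match the nonce in the CSP header
        script.appendChild(document.createTextNode(scriptDefinition));
        return script;
    }

    /** Creates an empty DOM document, for trying the helper outside docx4j. */
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A handler that always returns `null` would correspond to dropping the script entirely (option 3 in the original post), with the function then served from a separate file. The nonce has to be generated per response and repeated in the `script-src` directive.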

Similarly for the stylesheet, you could make it external; there is htmlSettings.setStyleElementHandler
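
For the stylesheet, one possible shape is a handler that returns a `link` element instead of an inline `style` element. The sketch below again uses only the JDK's DOM API; `ExternalStylesheetElements` and the example href are hypothetical, and it assumes the style handler may return any `Element` in place of the inline one. The generated CSS would still have to be written out and served at that URL by the application.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, not part of docx4j; a custom style handler could delegate to it.
class ExternalStylesheetElements {

    /** Builds a link element that references an externally served stylesheet. */
    static Element createStylesheetLink(Document document, String href) {
        Element link = document.createElement("link");
        link.setAttribute("rel", "stylesheet");
        link.setAttribute("type", "text/css");
        link.setAttribute("href", href);
        return link;
    }

    /** Creates an empty DOM document, for trying the helper outside docx4j. */
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With the stylesheet served from the same origin, `style-src 'self'` would be sufficient and 'unsafe-inline' would no longer be needed for it.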

Hope this approach works for you?