monitorjbl / excel-streaming-reader

An easy-to-use implementation of a streaming Excel reader using Apache POI
Apache License 2.0

Apply XML parser security features #121

Closed: pjfanning closed this 5 years ago

pjfanning commented 6 years ago

Copied over some POI code to protect the DOM parser from XML entity expansion attacks. The POI DocumentHelper code creates namespace-aware parsers, and these don't work with the current excel-streaming-reader code (the XPath expressions don't have namespace prefixes, etc.).
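For readers unfamiliar with the hardening described above, here is a minimal sketch of a DOM parser configured against entity expansion while staying non-namespace-aware. It is not the code from this PR or from POI's DocumentHelper; the class and method names (`HardenedDomParserSketch`, `newBuilder`, `parses`) are hypothetical, and it assumes the JDK's built-in Xerces-based parser, which recognizes the `disallow-doctype-decl` feature. Other parser implementations may reject that feature with a `ParserConfigurationException`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical illustration only; not the PR's or POI's actual implementation.
public class HardenedDomParserSketch {

  // Feature URI understood by Xerces-based parsers (assumed: the JDK default parser).
  private static final String DISALLOW_DOCTYPE =
      "http://apache.org/xml/features/disallow-doctype-decl";

  static DocumentBuilder newBuilder() throws ParserConfigurationException {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    // Stay non-namespace-aware so prefix-less XPath expressions are unaffected.
    factory.setNamespaceAware(false);
    // Ask the parser to apply its built-in processing limits.
    factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
    // Reject any document that carries a DOCTYPE, which is where entities are declared.
    factory.setFeature(DISALLOW_DOCTYPE, true);
    // Do not inline entity references into the resulting DOM tree.
    factory.setExpandEntityReferences(false);
    return factory.newDocumentBuilder();
  }

  // Returns true if the XML parses, false if the hardened parser rejects it.
  static boolean parses(String xml) throws ParserConfigurationException, IOException {
    try {
      newBuilder().parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
      return true;
    } catch (SAXException e) {
      return false;
    }
  }

  public static void main(String[] args) throws Exception {
    String plain = "<a><b/></a>";
    String withEntity = "<!DOCTYPE a [<!ENTITY x \"y\">]><a>&x;</a>";
    System.out.println("plain document accepted: " + parses(plain));
    System.out.println("DOCTYPE document accepted: " + parses(withEntity));
  }
}
```

Rejecting the DOCTYPE outright is the bluntest option: it blocks entity declarations entirely rather than trying to cap their expansion, which is acceptable when the input is not expected to contain a DTD.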

pjfanning commented 6 years ago

@monitorjbl could you review this, if you get a chance?

monitorjbl commented 6 years ago

There are two outstanding reviews already on this PR.

pjfanning commented 6 years ago

I don't see the review items for this PR. Would you be able to add the review comments again?

pjfanning commented 6 years ago

@monitorjbl can you reconsider merging this?

monitorjbl commented 5 years ago

@pjfanning I ended up implementing this over the weekend, but I did it differently from your PR. When you opened this, I didn't realize what an entity expansion attack could do or how serious it was; otherwise I would have jumped on it a lot sooner.