harshankur / officeParser

A Node.js library to parse text out of any office file. Currently supports docx, pptx, xlsx and odt, odp, ods.
MIT License
123 stars 17 forks

(question) feature request: legacy office files #9

Open guillenotfound opened 1 year ago

guillenotfound commented 1 year ago

Would it be possible to also support legacy Office files? If you know how it could be done, I can help implement it!

harshankur commented 1 year ago

Yeah, it is on my TODO list for this library, but Microsoft has not released any official specification for the legacy Office formats (doc, ppt, xls). They did release one long ago, but did not update it after they made a few changes to the formats. A few specifications exist, but they are not very accurate. I will try to get my hands on them and take a look. I will work on this too, I promise.

guillenotfound commented 1 year ago

Maybe this repo helps: https://github.com/nolze/msoffcrypto-tool