harshankur / officeParser

A Node.js library to parse text out of any office file. Currently supports docx, pptx, xlsx, odt, odp, and ods.
MIT License
123 stars, 17 forks

(Question) Feature request: maintain order of string and object content #17

Closed (kochecc2 closed this issue 10 months ago)

kochecc2 commented 11 months ago

I am trying to parse a .docx file where dates are listed in a table object, followed by lots of text for things that were done on those dates. When I parse it, all the dates appear at the bottom of the text output. I think this is because it parses all strings first and then moves on to objects. Can you please add a way to maintain the order of these elements as they appear in the .docx?
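To make the ordering problem above concrete, here is a minimal sketch of how grouping content by element type can push table text to the end, compared with a single pass in document order. It is not officeParser's actual code. The XML is a heavily simplified stand-in for `word/document.xml`, and `sampleXml`, `textRuns`, `extractGrouped`, and `extractInOrder` are hypothetical names used only for illustration.

```javascript
// Simplified stand-in for word/document.xml (assumption: real files carry
// namespaces, attributes, and far more markup than this).
const sampleXml =
  '<w:body>' +
  '<w:tbl><w:tr><w:tc><w:p><w:r><w:t>2023-01-05</w:t></w:r></w:p></w:tc></w:tr></w:tbl>' +
  '<w:p><w:r><w:t>Reviewed the draft.</w:t></w:r></w:p>' +
  '<w:tbl><w:tr><w:tc><w:p><w:r><w:t>2023-01-06</w:t></w:r></w:p></w:tc></w:tr></w:tbl>' +
  '<w:p><w:r><w:t>Sent the final version.</w:t></w:r></w:p>' +
  '</w:body>';

// Collect the text of every <w:t> run in an XML fragment, in the order found.
// The pattern requires ">" or whitespace right after "<w:t", so it does not
// match <w:tbl>, <w:tr>, or <w:tc>.
function textRuns(xml) {
  return [...xml.matchAll(/<w:t(?:\s[^>]*)?>([^<]*)<\/w:t>/g)].map((m) => m[1]);
}

// Hypothetical grouped extraction: plain paragraphs first, tables afterwards.
// This reproduces the symptom in the report (dates end up at the bottom).
function extractGrouped(xml) {
  const tablePattern = /<w:tbl>[\s\S]*?<\/w:tbl>/g;
  const tables = xml.match(tablePattern) || [];
  const withoutTables = xml.replace(tablePattern, '');
  return [...textRuns(withoutTables), ...tables.flatMap(textRuns)];
}

// Hypothetical document-order extraction: one pass over the whole body,
// so table text stays where it appears in the file.
function extractInOrder(xml) {
  return textRuns(xml);
}

console.log(extractGrouped(sampleXml));
// [ 'Reviewed the draft.', 'Sent the final version.', '2023-01-05', '2023-01-06' ]
console.log(extractInOrder(sampleXml));
// [ '2023-01-05', 'Reviewed the draft.', '2023-01-06', 'Sent the final version.' ]
```

The regular expressions only keep the sketch dependency-free. A real parser would more likely walk the parsed XML tree and visit child nodes in order.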

harshankur commented 10 months ago

Hi, I have updated officeParser to version 4.0.4, where I have addressed this issue. The order of tables has now been fixed. Please note that the order of footnotes and endnotes is still not fixed. Please check the Known Bugs section of the readme.

kochecc2 commented 9 months ago

Thanks very much! I just got around to updating the version and testing this and it works as expected.