harshankur / officeParser

A Node.js library to parse text out of any office file. Currently supports docx, pptx, xlsx, odt, odp, and ods.
MIT License
123 stars 17 forks

fix: fixes pdf-parse issue with ESM #31

Closed ChadHelbling closed 3 months ago

ChadHelbling commented 4 months ago

pdf-parse is old and unmaintained, and there's a known issue with ESM: https://gitlab.com/autokent/pdf-parse/-/issues/24

This adds the recommended workaround. Also fixes a path issue in the tester file.
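
The diff itself is not reproduced in this thread, so the following is only a sketch of what such a workaround commonly looks like. It assumes the fix is to load the package's inner module, `pdf-parse/lib/pdf-parse.js`, instead of the package entry point, so that any debug code in the entry file never runs. The helper name `loadPdfParse` and the injectable `importModule` parameter are hypothetical and exist only to keep the sketch testable without the package installed.

```javascript
// Assumption: the inner module path below is the one the workaround targets.
const PDF_PARSE_INNER_MODULE = 'pdf-parse/lib/pdf-parse.js';

// Hypothetical helper: loads pdf-parse through its inner module.
// `importModule` defaults to a dynamic import and can be swapped for a stub.
async function loadPdfParse(importModule = (specifier) => import(specifier)) {
    const mod = await importModule(PDF_PARSE_INNER_MODULE);
    // A CommonJS module imported from ESM exposes module.exports as `default`.
    return mod.default ?? mod;
}
```

With the package installed, `const pdf = await loadPdfParse();` followed by `await pdf(buffer)` would be the expected usage, where `buffer` holds the PDF bytes.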

harshankur commented 3 months ago

Thank you @ChadHelbling for helping out. However, I am removing the pdf-parse library, since it is no longer maintained and I don't know what fixes, if any, it will receive in the future. I now get the text content directly from pdf.js (Mozilla), which pdf-parse relies on as well. Please test the new version and let me know if the ESM problem occurs again.
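
For readers curious what reading text straight from pdf.js can look like, here is a minimal sketch. It is not the code used in officeParser. The function name `extractPdfText` is hypothetical, and the pdf.js module is passed in as `pdfjsLib` because the import path for `pdfjs-dist` differs between versions. The sketch uses only `getDocument`, `numPages`, `getPage`, and `getTextContent` from the pdf.js API.

```javascript
// Hypothetical helper: collects the text of every page of a PDF.
// `pdfjsLib` is the loaded pdf.js module; `data` holds the PDF bytes (e.g. a Uint8Array).
async function extractPdfText(pdfjsLib, data) {
    const doc = await pdfjsLib.getDocument({ data }).promise;
    const pages = [];
    // pdf.js page numbers start at 1.
    for (let pageNumber = 1; pageNumber <= doc.numPages; pageNumber++) {
        const page = await doc.getPage(pageNumber);
        const content = await page.getTextContent();
        // Each text item carries its string in `str`; items without it are skipped.
        pages.push(content.items.map((item) => item.str ?? '').join(' '));
    }
    // One line per page; real code may want smarter spacing and line breaks.
    return pages.join('\n');
}
```

Joining items with a single space is a simplification: it ignores the position data pdf.js provides, so column layouts and line breaks inside a page are not preserved.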

ChadHelbling commented 3 months ago

The new version works on my end, and I'm not seeing any of the previous issues. Thanks!