shebinleo / pdf2html

pdf2html is a module that converts PDF files to HTML pages using Apache Tika. It can also generate a thumbnail image for a PDF file using Apache PDFBox.
https://www.npmjs.com/package/pdf2html
Apache License 2.0

Allow to specify binary download URL #40

Closed. chengjianhua closed this issue 1 year ago

chengjianhua commented 2 years ago

Hello, I appreciate your great work on this.

I'm using this package, but the postinstall script always fails for me because of a slow network and firewall restrictions.

In addition, the https module this package uses does not honor the HTTP_PROXY environment variable, which makes the failure hard to work around.

I can see two possible ways to resolve this:

  1. Make download requests honor the HTTP_PROXY and HTTPS_PROXY environment variables. We could switch to a request library that supports them.

  2. Allow users to specify the download URL, so that they can point to a mirror located close to them.

For the second option, we could borrow the approach node-sass uses to support users worldwide:

https://github.com/sass/node-sass/blob/24741b351cb046c4548e77886647cd4c89b48c66/lib/extensions.js#L192-L199
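To make the two options concrete, here is a minimal sketch of what the postinstall script could do. It is not pdf2html's actual code. The environment variable `PDF2HTML_BINARY_SITE`, the npm config key `pdf2html_binary_site`, both helper names, and the example URLs are hypothetical; they only mirror the `SASS_BINARY_SITE` / `npm_config_sass_binary_site` pattern from node-sass. The proxy helper only selects a proxy URL from the environment; actually routing the request through it would still need a proxy-aware agent or request library, which is not shown.

```javascript
// Option 2 (hypothetical): let users override where binaries are downloaded from.
// PDF2HTML_BINARY_SITE / npm_config_pdf2html_binary_site are assumed names,
// not something pdf2html currently reads.
function resolveDownloadUrl(defaultUrl, env = process.env) {
  const site = env.PDF2HTML_BINARY_SITE || env.npm_config_pdf2html_binary_site;
  if (!site) return defaultUrl;
  // Keep the file name from the default URL and swap in the mirror base.
  const fileName = new URL(defaultUrl).pathname.split('/').pop();
  return `${site.replace(/\/+$/, '')}/${fileName}`;
}

// Option 1 (sketch): pick a proxy for the target URL from the conventional
// HTTP_PROXY / HTTPS_PROXY / NO_PROXY environment variables.
function resolveProxyUrl(targetUrl, env = process.env) {
  const { protocol, hostname } = new URL(targetUrl);
  const host = hostname.toLowerCase();
  const noProxy = (env.NO_PROXY || env.no_proxy || '')
    .split(',')
    .map((entry) => entry.trim().toLowerCase())
    .filter(Boolean);
  const bypass = noProxy.some((entry) => {
    if (entry === '*') return true;
    const suffix = entry.replace(/^\./, '');
    return host === suffix || host.endsWith(`.${suffix}`);
  });
  if (bypass) return null;
  const proxy = protocol === 'https:'
    ? env.HTTPS_PROXY || env.https_proxy
    : env.HTTP_PROXY || env.http_proxy;
  return proxy || null;
}

// Example with placeholder URLs: a mirror replaces the default host and path.
const mirrored = resolveDownloadUrl('https://example.com/dist/tika-app.jar', {
  PDF2HTML_BINARY_SITE: 'https://mirror.example.org/tika/',
});
console.log(mirrored);
```

With a setup like this, a user behind a firewall could set the mirror variable once (in the shell or in `.npmrc`) and leave the install command unchanged.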

chengjianhua commented 2 years ago

@shebinleo, could you take a look at this?

shebinleo commented 1 year ago

@chengjianhua The readme explains how to work around such issues: you can manually download the dependencies and place them in the vendor directory. https://github.com/shebinleo/pdf2html#manually-download-dependencies-files