jzillmann / pdf-to-markdown

A PDF to Markdown converter
https://pdf2md.morethan.io
MIT License
1.17k stars 189 forks

How can we install it on our localhost? #4

Closed: cavo789 closed this issue 3 years ago

cavo789 commented 7 years ago

Hello Johannes,

Thanks for pdf-to-markdown.

I was trying to install it on my local machine but did not succeed in getting it running. Is it possible?

I cloned the Git repository (git clone https://github.com/jzillmann/pdf-to-markdown) and then ran a few npm commands (npm install, npm lint, ...) but, once that is done, how can I start the interface?

The static page src/index.html only shows the empty <div id="main"/> (which seems logical), so how can I install and run the app locally?

Thanks a lot in advance !

jzillmann commented 7 years ago

Usually I run npm run watch, which builds the files continuously and writes the result to the build directory. Then I just open build/index.html!

cavo789 commented 7 years ago

Thanks for the quick answer.

So:

- git clone
- npm install: download all necessary npm packages
- npm run lint: lint the JavaScript files
- npm run test: run tests
- npm run check: lint and test
- npm run watch: continuously build the project
- open build/index.html

And then my browser will display index.html with a drop area where I can upload a PDF (on my localhost)?

Will try this afternoon.

Thanks !

jzillmann commented 7 years ago

That's the theory, yes! ;) It should look exactly like the online version...

cavo789 commented 7 years ago

Sorry Johannes, but it didn't work. Same result.

What I've done :

I did not run open build/index.html because open isn't a known command under Windows (but I presume it only starts a browser).

My problem : the build folder isn't there.

Here is the folder structure after npm run check: [screenshot: 2017-09-05_07h57_18]

Can you tell me how I can get the build folder please ?

Thanks.

jzillmann commented 7 years ago

Hey @cavo789, I think you're missing the last 'build-the-project' step...

Execute one of those:

HTH

cavo789 commented 7 years ago

One step further ;) thanks.

I ran npm run build (note: that step isn't mentioned in your documentation, https://github.com/jzillmann/pdf-to-markdown#useful-build-commands).

I get an error with npm run release:

c:\pdf-to-markdown>npm run release production

> pdf-to-markdown@0.1.1 release C:\pdf2md\pdf-to-markdown
> npm run lint && rm -rf build/* && NODE_ENV=production webpack -p "production"

> pdf-to-markdown@0.1.1 lint c:\pdf-to-markdown
> eslint src --ext .js --ext .jsx --cache

'NODE_ENV' is not recognized as an internal or external command,
operable program or batch file.

Should I add a parameter when running npm run release ?

Thanks for your patience...

jzillmann commented 7 years ago

So it seems to be a problem with how the script sets an environment variable (see package.json for the list of commands and what they do). The release script uses the inline form NODE_ENV=production webpack ..., which Unix shells understand but the Windows command prompt does not, so it tries to run NODE_ENV as a command.

See https://stackoverflow.com/questions/11928013/node-env-is-not-recognized-as-an-internal-or-external-command-operable-comman for an explanation.

The easiest fix seems to be installing win-node-env:

npm install -g win-node-env

and try to re-run the command!
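
To illustrate what has to happen for a command like NODE_ENV=production webpack -p to work on every platform, here is a minimal sketch in JavaScript (Node.js). It strips the leading NAME=value tokens from the command line and passes them to the child process through its environment instead of relying on the shell. The function names splitEnvPrefix and runWithEnv are hypothetical; this is not code from this repository and not a description of how win-node-env is implemented. Quoted values are not handled.

```javascript
// Sketch: emulate `NAME=value command args` on shells that do not
// understand inline variable assignments (for example cmd.exe).

// Split leading NAME=value tokens from the rest of a command line.
// Simplification: tokens are separated by whitespace, quoting is ignored.
function splitEnvPrefix(commandLine) {
  const tokens = commandLine.trim().split(/\s+/);
  const env = {};
  let i = 0;
  for (; i < tokens.length; i++) {
    const match = /^([A-Za-z_][A-Za-z0-9_]*)=(.*)$/.exec(tokens[i]);
    if (!match) break; // first token that is not an assignment starts the command
    env[match[1]] = match[2];
  }
  return { env, command: tokens.slice(i).join(" ") };
}

// Run the remaining command with the extra variables merged into the
// current environment. Hypothetical helper, not called in this sketch.
function runWithEnv(commandLine) {
  // Required lazily so the parsing helper stays dependency-free.
  const { spawnSync } = require("child_process");
  const { env, command } = splitEnvPrefix(commandLine);
  return spawnSync(command, {
    shell: true,
    stdio: "inherit",
    env: { ...process.env, ...env },
  });
}

const parsed = splitEnvPrefix("NODE_ENV=production webpack -p");
console.log(parsed.env.NODE_ENV); // production
console.log(parsed.command);      // webpack -p
```

The point of the sketch is that the variable travels in the child process environment, so it does not matter whether the shell is bash or the Windows command prompt.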

cavo789 commented 7 years ago

Last tests... still unsuccessful (I think I'll stop trying to install pdf-to-markdown on my localhost, never mind).

So I started over: git clone followed by npm install -g win-mode-env [sic: the package suggested above is win-node-env], but I get an error on this last command:

c:\pdf-to-markdown>npm install -g win-mode-env
npm ERR! Windows_NT 10.0.14393
npm ERR! argv "C:\\Tools\\nodejs\\node.exe" "C:\\Tools\\nodejs\\node_modules\\npm\\bin\\npm-cli.js" "install" "-g" "win-mode-env"
npm ERR! node v6.10.0
npm ERR! npm  v3.10.10
npm ERR! code E404

npm ERR! 404 Registry returned 404 for GET on https://registry.npmjs.org/win-mode-env
npm ERR! 404
npm ERR! 404  'win-mode-env' is not in the npm registry.
npm ERR! 404 You should bug the author to publish it (or use the name yourself!)
npm ERR! 404
npm ERR! 404 Note that you can also install from a
npm ERR! 404 tarball, folder, http url, or git url.

npm ERR! Please include the following file with any support request:
npm ERR!     c:\pdf-to-markdown\npm-debug.log

And before starting again, I noticed that npm complained about .jsx files: they are not supported on my Windows computer (and there are a lot of .jsx files in your repo). Just renaming the files from .jsx to .js doesn't work, of course, because some scripts refer specifically to .jsx, and even if I change .jsx to .js in your JavaScript files, npm build still fails.

Oh well, too bad: it seems not to work under Windows.

Thanks for your help Johannes, very much appreciated.

Have a nice day.

(can be considered as closed)

jzillmann commented 7 years ago

Hey @cavo789 sorry for that!

What was your intent in running it on localhost?

cavo789 commented 7 years ago

Not your fault !!! ;-)

On my localhost because many of the PDF files I maintain contain private HR data.

I'm developing my own tool based on Markdown files (called Marknotes; see my repository if you're curious). The idea was to convert .pdf to .md so I can manage them through my interface.

With an online tool like yours, we need to upload the file to the internet, which is always a bad idea when the file contains sensitive information. That's why I really prefer to self-host such applications.

Have a nice day.

jzillmann commented 7 years ago

@cavo789 Nice!

So even for the hosted version at http://pdf2md.morethan.io it's NOT true that you upload files to the internet. It's a client-side, JavaScript-only application, so none of your data leaves your machine.

You can try as follows:

So this is just for verification... once you trust the app, you don't need to disconnect anymore! To double-check, you could use the Chrome/Firefox dev consoles to see what's coming in and what's going out... There should be only in, no out!

Alternatively, you can also grab the files from https://github.com/jzillmann/pdf-to-markdown/tree/master/docs and just open the index.html on your machine. Haven't tried, but should work as well!