Don't load files, use raw data

jooola commented 4 years ago

I would like to propose a breaking change (maybe it is one).

In order to simplify the usage of this library (it is a library before being a CLI ?), I propose to change any custom template options such as styles, header or footer, to be imported as raw data in a buffer|string, instead of using file paths. Same could apply to the markdown data.

This should allow more flexibility and remove some uncertainty. Still, puppeteer might require file paths for the footer or header, maybe it is easier to write the files somewhere before giving the path to puppeteer. Since I am only talking on the library side, files can still be specified in the CLI for instance, we just need to load them first using the method one prefer.

I am a bit radical here, maybe we could add some options first with different names, and deprecate the one we would like to rename in the future major release.

Maybe we could discuss of a future API for this library and slowly implement the new feature and finally release a major version with those changes.

@BlueHatbRit Any thoughts on this one ?

elliotblackburn commented 4 years ago

Hey, so actually mdpdf is primarily a CLI tool with the JS api being secondary, or at least it has been in the past. I originally created it during my degree as I needed a way to format some lab reports with code snippets and using Ms Word was killing me!

I still very much see the CLI as the primary method of use for the module. The JS api is still important and with it's wide use in the atom package I'm keep to keep it well oiled and user friendly, but when it comes to thinking about the inputs I really feel like filenames is the right way to go at the moment. When using it by the CLI, being able to pass in file names is much much easier, especially when writing scripts around it.

I'm not entirely clear what the benefits would be of using a buffer/string for the inputs rather than file paths. Puppeteer needs us to have files on disk to load in so I feel like we'd be writing the buffer's straight to disk, then reading them anyway. What benefits do you see for using a buffer/string instead of a file path, are there things we can't do right now with the file path, or that become significnatly easier with a buffer/string approach? If you have some use cases, that might help me to understand a bit more and could justify having some sort of translation layer between the api and cli. 😄

Thanks for your time and energy on these ideas, they're really deeply appreciated.

travs commented 4 years ago

Hi, just adding my 2 cents.

Changing to using raw data would help for projects that consume mdpdf like atom markdown-pdf. When we would pass in an html string to be rendered from the atom settings panel, right now we would have to write out an intermediary temporary file with those contents, then pass the file handle to mpdf.

I think @BlueHatbRit's idea about the CLI UI is correct (passing file handles is much nicer), but we could retain that and just read that file before passing it to the JS api layer.

elliotblackburn commented 4 years ago

Yeah this is true we could do it for the md->html step and allow a string or buffer to be passed through.

Okay I think you've both sort of convinced me for the most part. I think if we did this I'd probably want to support both options through the JS api though as it would make maintaining the CLI easier and it would allow people to continue using it with files as they already are.

If we could have a nice interface which handles both approaches, I'd be happy to work on a PR with someone to do this. I think we'd probably want to expose two different functions for clarity, one which does the traditional way of reading a file, and one which does it via a string/buffer. Of course the file one could call the string/buffer one.

I also think for simplicity I'd probably rather support a string than a buffer for now as it's easier to maintain and debug. I'm not against it in general, but I'd rather have some specific use cases to lay up against it (as we do with the string scenario for atom markdown-pdf).

Would someone like to propose the new interfaces here and we can discuss them before starting on any code? It might save us all a bit of dev time.

adriaandotcom commented 3 years ago

I'm using mdpdf to convert specific documentation pages to PDF. I don't have them locally on my server, but grab them from https://raw.githubusercontent.com/simpleanalytics/docs/master/_docs/10_general/10_what-we-collect.md for example. I do some front matter magic with yaml-front-matter which outputs a string. Would be great to use the string directly with mdpdf without the need to write it to disk.

elliotblackburn / mdpdf

Don't load files, use raw data #86