mixmark-io / turndown

🛏 An HTML to Markdown converter written in JavaScript
https://mixmark-io.github.io/turndown
MIT License
8.93k stars, 880 forks

Can we use this library to read the contents from an HTML file instead of passing the HTML content as text input? Please advise. #464

Closed: gituserjava closed this issue 6 months ago

pavelhoral commented 6 months ago

This should work if you read the file yourself and pass the string input to the library.

So the answer to "can we use" is yes, you can. If you expect the library to read a file based on a provided filesystem path or URL, then the answer is no: the library won't read from your filesystem or fetch from a URL.
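A minimal sketch of the "read the file yourself" approach, assuming Node.js. The helper name `htmlFileToMarkdown` and the file name `page.html` are made up for illustration. The converter is passed in as an argument, so the helper depends only on the `turndown(input)` method.

```javascript
const fs = require('node:fs');

// Hypothetical helper: reads an HTML file from disk and hands the resulting
// string to a Turndown service. The library itself never touches the file.
function htmlFileToMarkdown(filePath, service) {
  const html = fs.readFileSync(filePath, 'utf8'); // we do the file I/O ourselves
  return service.turndown(html);                  // Turndown only sees a string
}

// Usage (assumes `npm install turndown` and an existing page.html):
//   const TurndownService = require('turndown');
//   const markdown = htmlFileToMarkdown('page.html', new TurndownService());
//   console.log(markdown);
```

For content behind a URL the same idea applies: download the HTML with whatever HTTP client you prefer, then pass the response body to `turndown` as a string.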

martincizek commented 6 months ago

Alternatively, you can use a DOM parser that is capable of stream parsing, which can save some resources. You can then pass the resulting DOM to Turndown instead of a string.
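A rough sketch of passing a DOM node rather than a string. Turndown accepts element, document, and document fragment nodes as input. The helper name `domToMarkdown` is made up for illustration, and the parser that produces the node is left to the caller. The usage comment uses jsdom only as a stand-in, and jsdom is not a stream parser.

```javascript
// DOM nodeType values Turndown can convert: element (1), document (9),
// document fragment (11).
const CONVERTIBLE_NODE_TYPES = new Set([1, 9, 11]);

// Hypothetical helper: checks that `node` looks like a convertible DOM node,
// then passes it straight to the Turndown service without serializing it.
function domToMarkdown(node, service) {
  if (!node || !CONVERTIBLE_NODE_TYPES.has(node.nodeType)) {
    throw new TypeError('expected an element, document, or document fragment node');
  }
  return service.turndown(node);
}

// Usage (assumes `npm install turndown jsdom`; jsdom is a stand-in parser):
//   const TurndownService = require('turndown');
//   const { JSDOM } = require('jsdom');
//   JSDOM.fromFile('page.html').then((dom) => {
//     console.log(domToMarkdown(dom.window.document, new TurndownService()));
//   });
```

Whether this actually saves resources depends on the parser you choose and on how large the input is.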