philschmid / clipper.js

HTML to Markdown converter and crawler.
Apache License 2.0
488 stars, 33 forks

Add support for "directory" input instead of single file #5

Closed philschmid closed 10 months ago

philschmid commented 10 months ago

Add support for providing a directory of multiple HTML files, instead of a single file, for "clipping". The idea would be to read the directory, convert the files to PDFs, and then save them as a single dataset.jsonl.
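
A rough sketch of what the directory flow could look like in Node.js, not the project's actual implementation. The names `clipDirectory` and `naiveHtmlToText`, and the `source`/`content` record fields, are hypothetical. The comment mentions PDFs, but since the project describes itself as an HTML to Markdown converter, the sketch assumes a text-like output and takes the converter as a parameter; the default is only a dependency-free placeholder that strips tags.

```javascript
const fs = require("fs");
const path = require("path");

// Placeholder converter (hypothetical): strips tags and collapses whitespace
// so the sketch runs without dependencies. A real run would pass in the
// project's own HTML converter instead.
function naiveHtmlToText(html) {
  return html.replace(/<[^>]*>/g, " ").replace(/\s+/g, " ").trim();
}

// Reads every .html/.htm file in inputDir, converts each one, and writes
// one JSON object per line to outputFile. Returns the number of files written.
function clipDirectory(inputDir, outputFile, convert = naiveHtmlToText) {
  const files = fs
    .readdirSync(inputDir)
    .filter((name) => /\.html?$/i.test(name))
    .sort(); // stable order so repeated runs produce the same dataset

  const lines = files.map((name) => {
    const html = fs.readFileSync(path.join(inputDir, name), "utf8");
    return JSON.stringify({ source: name, content: convert(html) });
  });

  fs.writeFileSync(outputFile, lines.length ? lines.join("\n") + "\n" : "");
  return files.length;
}
```

Calling `clipDirectory("./pages", "./dataset.jsonl")` would then produce one record per HTML file. Passing the converter as an argument keeps the directory handling independent of whichever conversion step ends up being used.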