alephdata / memorious

Lightweight web scraping toolkit for documents and structured data.
https://docs.alephdata.org/developers/memorious
MIT License
311 stars, 59 forks

Run a crawler from a yaml file #161

Closed by sunu 3 years ago

sunu commented 3 years ago

Adds a `run-file` command that runs a crawler from a YAML file and, optionally, a source file directory. The CLI syntax looks like `memorious run-file test.yaml --src foo/test/`. `run-file` also accepts the optional arguments used by `run`, such as `--threads` and `--flush`.
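To make the argument layout concrete, here is a minimal sketch of a wrapper that assembles the command line described above. The helper `build_run_file_command` is hypothetical and not part of memorious. It also assumes that `--threads` takes an integer value and that `--flush` is a boolean switch, which this description does not spell out.

```python
import shlex
from typing import List, Optional


def build_run_file_command(
    config: str,
    src: Optional[str] = None,
    threads: Optional[int] = None,
    flush: bool = False,
) -> List[str]:
    """Assemble the argv for `memorious run-file` (hypothetical helper)."""
    # The YAML crawler config is the only required argument.
    cmd = ["memorious", "run-file", config]
    # Optional directory containing the crawler's source files.
    if src is not None:
        cmd += ["--src", src]
    # Assumption: --threads is followed by an integer thread count.
    if threads is not None:
        cmd += ["--threads", str(threads)]
    # Assumption: --flush is a flag that takes no value.
    if flush:
        cmd.append("--flush")
    return cmd


if __name__ == "__main__":
    # Print the command as a shell-safe string; pass the list to
    # subprocess.run() instead to actually invoke memorious.
    print(shlex.join(build_run_file_command("test.yaml", src="foo/test/")))
```

Returning a list rather than a single string keeps the result safe to hand to `subprocess.run` without shell quoting concerns.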