andykais opened this issue 5 years ago
For now, this is a back-burner issue. The biggest use case was reusing login logic across different scrapers.
If I see a pressing reason I will implement it; until then I will encourage the community to build fully independent configs & options.
Sample consumable scraper:

```
example-scraper/
  package.json
  config.json
  options.json
  readme.md
```
The dream here is to let other users maintain scrapers in a community repo, or on their own githubs, and let developers simply install them via npm.
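As a rough sketch of the consumer side (the package name, file layout, and config keys below are placeholders, not a finalized API), pulling one of these packages into a local scraper could look something like:

```typescript
// npm install @community-scrapers/twitter-login
//
// Rough sketch only: the package name and the config/options shapes are placeholders.
// The module ships its own config.json + options.json, and the consuming scraper
// composes them into its local config. (JSON imports assume `resolveJsonModule`.)
import moduleConfig from '@community-scrapers/twitter-login/config.json'
import moduleOptions from '@community-scrapers/twitter-login/options.json'

const config = {
  define: {
    // expose the module under a local name; local defs may override module defs
    'twitter-login': moduleConfig,
  },
  // ...the rest of the local scraper structure
}

const options = {
  ...moduleOptions,
  folder: './downloads', // consumer-level overrides
}
```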
- `ConfigInit` yields `Config`
- Local `define` defs can override those inside the module `define`.
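As a sketch of that override rule (assuming `define` is a plain map from def name to definition; the helper name is made up):

```typescript
// Local defs shadow module defs with the same name. `define` is assumed to be a
// plain map of def name -> definition; the exact shape is not settled here.
type DefineMap = Record<string, unknown>

const mergeDefines = (moduleDefine: DefineMap, localDefine: DefineMap): DefineMap => ({
  ...moduleDefine,
  ...localDefine, // a local def with the same name wins over the module's def
})
```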
### How to wire this stuff up?
#### inputs

Create an object in each `ScrapeStep` that came from a module. The object should map full input keys to the module's internal keys. The internal keys are the ones actually used in the handlebar templates, e.g. `scrape`.
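A sketch of that input-key mapping (names and shapes are illustrative only): each module-derived `ScrapeStep` carries a map from the full, namespaced input key to the internal key its templates reference, and inputs are rescoped through it before template rendering.

```typescript
type InputMap = Record<string, string> // full input key -> module-internal key

const scopeInputs = (
  inputs: Record<string, string>,
  inputMap: InputMap
): Record<string, string> => {
  const scoped: Record<string, string> = {}
  for (const [fullKey, internalKey] of Object.entries(inputMap)) {
    scoped[internalKey] = inputs[fullKey]
  }
  return scoped
}

// e.g. scopeInputs({ 'twitter-login.username': 'someone' }, { 'twitter-login.username': 'username' })
// -> { username: 'someone' }, usable in the module's templates as {{ username }}
```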
Two options:

- create a `flow.ts` instance for the module and hook that up to whatever is above/below it.
- merge into the `scrapeEach` arrays and reattach the rest of the structure there.
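A very rough sketch of the second option (the `ConfigNode` shape below is an assumption, not the library's real types): splice the module's tree into the parent's `scrapeEach` array, then reattach whatever followed the module locally underneath it.

```typescript
// Assumed, simplified node shape -- only here to illustrate the splicing idea.
interface ConfigNode {
  scrape: unknown
  scrapeEach: ConfigNode[]
}

const spliceModule = (
  parent: ConfigNode,
  moduleRoot: ConfigNode,
  rest: ConfigNode[] // the structure that originally followed the module locally
): ConfigNode => ({
  ...parent,
  scrapeEach: [
    ...parent.scrapeEach,
    // the module becomes a child of the parent; the local structure that used to
    // follow it is reattached beneath the module (appended to its root here,
    // which is a simplification)
    { ...moduleRoot, scrapeEach: [...moduleRoot.scrapeEach, ...rest] },
  ],
})
```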
#### stateful values

There may be times when a local/module scraper gets a value that you want for the rest of the run. Most often this will be an auth/access token. This is essentially global state: whenever `'@community-scrapers/twitter-login'` gives us a value, we update the input value for `'accessToken'` and replace the passed-down value with `''`.
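A minimal sketch of that flow, assuming inputs live in a mutable map shared across the run (the hook name and shapes are illustrative only):

```typescript
// Run-wide input state; the shape is an assumption for this sketch.
const inputs: Record<string, string> = { accessToken: '' }

// Hypothetical hook, called whenever a module scraper emits a value.
const onModuleValue = (moduleName: string, value: string): string => {
  if (moduleName === '@community-scrapers/twitter-login') {
    inputs.accessToken = value // promote the token to run-wide input state
    return '' // replace the value passed down to child steps
  }
  return value
}
```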
#### organizing dependencies
It is possible to have a separate directory where module scrapers live, using `worker_threads`. Your main nodejs process can run something like:
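(rough sketch only; the worker file path, inputs, and message shape below are placeholders)

```typescript
import { Worker } from 'worker_threads'

// Run a module scraper from its own directory inside a worker thread, passing
// its inputs via workerData and collecting emitted values (e.g. an access
// token) back over the parent's message channel.
const worker = new Worker('./module-scrapers/twitter-login/index.js', {
  workerData: { username: 'someone', password: 'secret' },
})

worker.on('message', (value) => {
  console.log('value emitted by module scraper:', value)
})
worker.on('error', (error) => console.error(error))
worker.on('exit', (code) => console.log(`module scraper exited with code ${code}`))
```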