neume-network / core

A socially-scalable music NFT indexer.
https://neume.network
GNU General Public License v3.0

Move pipeline specification to strategies #73

Open sirnicolaz opened 2 years ago

sirnicolaz commented 2 years ago

Problem statement

Currently, the dependencies between the different steps of the pipeline are defined inside the core package, in crawl_path.mjs.

As a result, contributors who want to write a strategy have to modify both the strategies package and the core package. Given my understanding of the architecture, core should only provide the crawling boilerplate and orchestration, and should be unaware of the specifics of the implemented strategies.

Mitigation proposal

Move crawl_path.mjs inside strategies.

Technical credit

The more pipelines are integrated, the more complex their dependencies will become. There is already a non-trivial graph defining how the soundxyz + zora pipelines should work: a common parent step handles the web3subgraph crawling, the graph then branches out to the crawling of the two platforms, and the branches merge again in the musicosaccumulator.
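To make the shape of that graph concrete, here is a minimal sketch. The step names are the ones mentioned above, but the object layout and the `toStages` helper are hypothetical and are not taken from the current crawl_path.mjs.

```javascript
// Hypothetical dependency graph: each key is a step, each value lists the
// steps it depends on. Only the step names come from this issue.
const pipeline = {
  web3subgraph: [],
  soundxyz: ["web3subgraph"],
  zora: ["web3subgraph"],
  musicosaccumulator: ["soundxyz", "zora"],
};

// Group steps into stages. Every step in a stage depends only on steps from
// earlier stages, so the steps inside one stage could run side by side.
function toStages(graph) {
  const remaining = new Map(Object.entries(graph));
  const done = new Set();
  const stages = [];
  while (remaining.size > 0) {
    // A step is ready once all of its dependencies finished in earlier stages.
    const ready = [...remaining.keys()]
      .filter((step) => remaining.get(step).every((dep) => done.has(dep)))
      .sort();
    if (ready.length === 0) {
      throw new Error("Cycle or unknown dependency in pipeline graph");
    }
    for (const step of ready) {
      remaining.delete(step);
      done.add(step);
    }
    stages.push(ready);
  }
  return stages;
}

const stages = toStages(pipeline);
console.log(stages);
```

With this layout the branch and the merge are visible in the data itself: soundxyz and zora land in the same stage, and musicosaccumulator only runs once both are done.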

Further down the road, other platforms might need to be integrated into this same pipeline, possibly with additional in-between transformation steps.

Having a way to clearly define the dependency graph in the strategies repository would help future contributors understand how their new strategy should be integrated, either by adding it to an existing graph or by creating a wholly independent one (such as the get-xkcd crawler).
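One possible way to express this, purely as an illustration: each strategy declares what it depends on, and the declarations are grouped into independent graphs. The `name` and `dependsOn` fields and the `splitPipelines` helper are assumptions made for this sketch, not an existing neume API.

```javascript
// Hypothetical strategy declarations, as they might live in the strategies
// package. Only the strategy names come from this issue.
const strategies = [
  { name: "web3subgraph", dependsOn: [] },
  { name: "soundxyz", dependsOn: ["web3subgraph"] },
  { name: "zora", dependsOn: ["web3subgraph"] },
  { name: "musicosaccumulator", dependsOn: ["soundxyz", "zora"] },
  { name: "get-xkcd", dependsOn: [] },
];

// Split the declarations into independent pipelines (connected components),
// so an unrelated crawler such as get-xkcd ends up in its own graph.
function splitPipelines(declarations) {
  const neighbors = new Map(declarations.map((s) => [s.name, new Set()]));
  for (const { name, dependsOn } of declarations) {
    for (const dep of dependsOn) {
      if (!neighbors.has(dep)) {
        throw new Error(`"${name}" depends on unknown strategy "${dep}"`);
      }
      // Treat the edge as undirected: we only care which steps are connected.
      neighbors.get(name).add(dep);
      neighbors.get(dep).add(name);
    }
  }
  const seen = new Set();
  const pipelines = [];
  for (const { name } of declarations) {
    if (seen.has(name)) continue;
    // Depth-first walk collecting every step reachable from `name`.
    const component = [];
    const stack = [name];
    seen.add(name);
    while (stack.length > 0) {
      const current = stack.pop();
      component.push(current);
      for (const next of neighbors.get(current)) {
        if (!seen.has(next)) {
          seen.add(next);
          stack.push(next);
        }
      }
    }
    pipelines.push(component.sort());
  }
  return pipelines;
}

const pipelines = splitPipelines(strategies);
console.log(pipelines);
```

A contributor adding a new platform would then only add one declaration in strategies. Whether it joins the existing music graph or forms its own follows from its `dependsOn` list, and a typo in a dependency name fails loudly.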

TimDaub commented 2 years ago

@sirnicolaz pls define this issue