Andrei-Dolgolev opened this issue 2 years ago (status: Open)
About "inputs"
and "value"
:
"inputs"
-> "input_schema"
"value"
-> "inputs"
The algorithm to calculate the order of tasks is basically a topological sort of the vertices of a directed acyclic graph.
We have an implementation in our codebase here: https://github.com/bugout-dev/shnorky/blob/a1948fa8299677105cb9e80140ad5c44c2131bfc/flows/specification.go#L125
You should not make a distinction between WithSubcalls and WithoutSubcalls. Just resolve everything into levels the same way. The tasks that don't have subcalls automatically go to level 0.
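For reference, a minimal Python sketch of that level resolution (a Kahn-style topological sort over task dependencies; the function and field names are illustrative, not taken from the codebase):

```python
def resolve_levels(tasks):
    """Assign each task to an execution level.

    `tasks` maps a task id to the set of task ids it depends on.
    Tasks with no dependencies go to level 0; every other task lands
    one level above its deepest dependency. Raises on cycles.
    """
    levels = {}
    remaining = dict(tasks)
    current_level = 0
    while remaining:
        ready = [t for t, deps in remaining.items()
                 if all(d in levels for d in deps)]
        if not ready:
            raise ValueError("cycle detected in task dependencies")
        for t in ready:
            levels[t] = current_level
            del remaining[t]
        current_level += 1
    return levels


print(resolve_levels({
    "totalSupply": set(),
    "getDNA": {"totalSupply"},
    "getUnicornBodyParts": {"getDNA"},
}))
# {'totalSupply': 0, 'getDNA': 1, 'getUnicornBodyParts': 2}
```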
About "inputs" and "value":
"inputs" -> "input_schema"
"value" -> "inputs"
"outputs" should also be called "output_schema".
I don't like how complex tasks are nested like this - what if we want to use the output of totalSupply in multiple steps? I think it's best to associate an ID with each task and specify that the output of a task can be used as input by any other task (as long as it doesn't cause cycles in the execution graph).
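As a purely hypothetical illustration of that ID-based wiring (the field names are made up, not a proposed schema), each task could reference the outputs of other tasks by ID in a flat list:

```python
# Hypothetical flat task list: any task can consume any other task's
# output by ID, as long as the resulting graph stays acyclic.
tasks = [
    {"id": "total_supply", "method": "totalSupply", "inputs": []},
    {"id": "dna", "method": "getDNA",
     "inputs": [{"from_task": "total_supply"}]},
    {"id": "body_parts", "method": "getUnicornBodyParts",
     "inputs": [{"from_task": "dna"}]},
]
```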
We also need to specify how the output of one step can be used as input to the next step. The totalSupply output is a single number that is used as the upper bound of a loop (in biologist). But there could also be cases where a step produces a list of items as output and we want to map some call over the entire list (e.g. in the future we will do token URIs -> curl).
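A rough sketch of those two data-flow cases, with the actual view call and HTTP fetch left as placeholder parameters (not existing helpers; only the data flow between steps matters here):

```python
def crawl(call_view, fetch):
    """call_view and fetch are placeholders for the real view-method call
    and HTTP fetch; only the data flow between steps matters here."""
    # Case 1: a scalar output used as an upper bound on a loop (as in biologist).
    total_supply = call_view("totalSupply")
    dna = [call_view("getDNA", token_id) for token_id in range(total_supply)]

    # Case 2: a list output mapped over a follow-up step (token URIs -> curl).
    uris = [call_view("tokenURI", token_id) for token_id in range(total_supply)]
    metadata = [fetch(uri) for uri in uris]
    return dna, metadata
```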
@Andrei-Dolgolev: I am okay even if we don't have full dependency semantics to connect inputs to outputs in version 1. The only thing we need to think about carefully is whether our configuration schema can be extended easily to support those kinds of relationships in the future.
Also, another thought: we are really talking about programming here -- map operation over output of previous step, loop over output of previous step, conditional execution based on output of previous step.
Maybe instead of using JSON we should define these pipelines as Python or JS scripts?
This is how Apache Airflow works: https://airflow.apache.org/docs/apache-airflow/stable/tutorial.html
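For comparison, this is roughly what such a pipeline looks like as a plain Python DAG in Airflow 2.x (a sketch; the task bodies are left as stubs and only the wiring between steps is shown):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def fetch_total_supply(**context):
    ...  # call the totalSupply view method, push the result via XCom


def fetch_dna(**context):
    ...  # pull totalSupply from XCom, loop over token ids


with DAG(dag_id="state_crawler", start_date=datetime(2023, 1, 1),
         schedule_interval="@hourly", catchup=False) as dag:
    total_supply = PythonOperator(task_id="total_supply",
                                  python_callable=fetch_total_supply)
    get_dna = PythonOperator(task_id="get_dna", python_callable=fetch_dna)
    total_supply >> get_dna
```

Because the pipeline is ordinary Python, mapping, looping, and conditional execution over previous outputs are just regular code.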
One moment about:
inputs:
"inputs" -> "input_schema"
"value" -> "inputs"
outputs:
"output" -> "output_schema"
"value" -> "output"
I agree with renaming value, but "output" is reserved. In general, right now we generate the full set of required contract interfaces from all the sets of tasks; that is why the task schema is mainly an extended ABI, because it made it simple to deduplicate call results and the required contract calls.
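A rough sketch of that deduplication step, assuming (purely for illustration) that each task carries a contract address and an ABI-like method name:

```python
def collect_required_calls(tasks):
    # Illustrative only: assume each task carries the contract address
    # plus an ABI-like "name" field for its view method.
    return {(task["address"], task["name"]) for task in tasks}
```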
> I don't like how complex tasks are nested like this - what if we want to use the output of totalSupply in multiple steps? I think it's best to associate an ID with each task and specify that the output of a task can be used as input by any other task (as long as it doesn't cause cycles in the execution graph).
That is exactly how it already works for tasks: we create a hash for each call, and a dict of responses (a plain checked object for now) is kept; whenever any element of the tree points to that hash, it gets the already existing response.
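A minimal sketch of that hash-based deduplication, assuming the hash is derived from the call parameters (names and hashing scheme are illustrative):

```python
import hashlib
import json

responses = {}  # call hash -> already-fetched response


def call_hash(address, method, args):
    payload = json.dumps([address, method, list(args)], sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def call_once(address, method, args, do_call):
    """Return the cached response if any element of the tree already made
    this exact call; otherwise perform it once and cache the result."""
    key = call_hash(address, method, args)
    if key not in responses:
        responses[key] = do_call(address, method, args)
    return responses[key]
```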
Existing problems
1: biologist crawler
Currently, besides calculations, biologist has 2 main parts: it interacts with the QueryAPI and with the blockchain.
If we have the state of view methods in the database, we can remove the requirement of interacting with the blockchain and Multicall contracts.
2: OpenSea and locked assets
Currently, to understand whether an asset is locked, they need to call the method that reports whether a unicorn is locked.
If we have the state of the asset view methods, we can tell whether an asset is locked or not; that could be a good API for users.
3: Get stats of NFTs
Usually you need to write a crawler of the metadata URL and refresh the current metadata state.
Suggestion for state crawler version 1 (crawl repeats on an interval, 1 blockchain, brownie)
Crawling tasks:
Simple task
Get total supply:
The task has an address and all required data to generate the contract interface, make the call through Multicall2, and decode the output.
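As an illustration only (not the actual schema), such a simple task might look like an extended ABI entry that also carries the target address:

```python
# Hypothetical example of a simple task: everything needed to build the
# contract interface, batch the call through Multicall2, and decode the output.
simple_task = {
    "type": "function",
    "name": "totalSupply",
    "address": "0x...",  # target contract address
    "inputs": [],
    "outputs": [{"name": "", "type": "uint256"}],
}
```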
Value: can be:
Because we have subdependencies between the inputs and outputs of different view methods (see case 1):
Complex task
Nested structure:
To repeat the logic of the task from case 1, the required chain of calls is:
totalSupply -> getDNA -> getUnicornBodyParts
because getDNA needs the range of current tokens and getUnicornBodyParts requires the DNA as input.
To resolve that nesting we can use the following algorithm:
We parse tasks as in the picture:
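For illustration, the chain could be expressed as nested tasks and then flattened by depth, so the deepest dependency runs first (field names are hypothetical):

```python
# Hypothetical nested form of the chain totalSupply -> getDNA -> getUnicornBodyParts.
complex_task = {
    "name": "getUnicornBodyParts",
    "inputs": [{
        "name": "getDNA",
        "inputs": [{
            "name": "totalSupply",
            "inputs": [],
        }],
    }],
}


def flatten(task, level=0, levels=None):
    """Record how deep each call sits in the nested task;
    deeper entries must be executed first."""
    if levels is None:
        levels = {}
    levels[task["name"]] = max(level, levels.get(task["name"], 0))
    for sub in task.get("inputs", []):
        flatten(sub, level + 1, levels)
    return levels


print(flatten(complex_task))
# {'getUnicornBodyParts': 0, 'getDNA': 1, 'totalSupply': 2}
```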
How to store
As @zomglings suggested, just put it in the same labels table.
Inputs and outputs decoding
In the moonworm crawler we have a decoder of transactions which parses the output into a dictionary.
We need to do the same for the arguments of the inputs and the output parameters, using the provided ABI.
Currently the Multicall contract returns us a tuple.
Sometimes the ABI has no names for the outputs and inputs:
but sometimes names are required for complex outputs, as in the example from the task above:
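A minimal sketch of mapping a returned tuple onto the ABI output names (unnamed outputs fall back to positional keys; this assumes the values are already type-decoded):

```python
def decode_outputs(abi_entry, values):
    """Map a tuple of decoded return values onto the ABI output names;
    unnamed outputs get positional keys like "out_0"."""
    result = {}
    for i, (output, value) in enumerate(zip(abi_entry["outputs"], values)):
        name = output.get("name") or f"out_{i}"
        result[name] = value
    return result


# Example with the (unnamed) totalSupply output:
abi_entry = {"name": "totalSupply",
             "outputs": [{"name": "", "type": "uint256"}]}
print(decode_outputs(abi_entry, (10000,)))  # {'out_0': 10000}
```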