forbole / egldjuno

Creative Commons Zero v1.0 Universal

Query blocks from Elasticsearch without multithreaded workers? #1

Open HarleyAppleChoi opened 2 years ago

HarleyAppleChoi commented 2 years ago

Feature description

There is an Elasticsearch database already built by the Elrond team. To parse all the blocks from Elrond, we query Elasticsearch, where the blocks are already indexed. However, with "scroll" in Elasticsearch the results can only be read sequentially, in our case from the latest block to the oldest. Also, only one worker can work on a scroll, because the scroll_id for the next page is obtained only once the previous page has been fetched. A single scroll therefore cannot be split across multiple workers.
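A minimal sketch of why a scroll is sequential. It does not call Elasticsearch: `FakeScrollIndex`, `search`, `scroll` and `fetch_all` are hypothetical names for an in-memory stand-in, and treating each scroll ID as single-use is a simplifying assumption made for illustration.

```python
from typing import Dict, List, Tuple


class FakeScrollIndex:
    """In-memory stand-in for a scrolled search (NOT the real Elasticsearch client)."""

    def __init__(self, heights: List[int], page_size: int) -> None:
        # Newest block first, mirroring the "latest to oldest" order in this issue.
        self._heights = sorted(heights, reverse=True)
        self._page_size = page_size
        self._cursors: Dict[str, int] = {}
        self._counter = 0

    def _issue(self, offset: int) -> str:
        # Hand out a fresh ID that remembers where the next page starts.
        self._counter += 1
        scroll_id = f"scroll-{self._counter}"
        self._cursors[scroll_id] = offset
        return scroll_id

    def search(self) -> Tuple[List[int], str]:
        """Open the scroll: return the first page and the ID for the next one."""
        page = self._heights[: self._page_size]
        return page, self._issue(len(page))

    def scroll(self, scroll_id: str) -> Tuple[List[int], str]:
        """Return the next page; needs the ID returned by the previous call."""
        offset = self._cursors.pop(scroll_id)  # single-use in this fake
        page = self._heights[offset : offset + self._page_size]
        return page, self._issue(offset + len(page))


def fetch_all(index: FakeScrollIndex) -> List[int]:
    """Walk every page in order; page N+1 cannot be requested before page N."""
    heights: List[int] = []
    page, scroll_id = index.search()
    while page:
        heights.extend(page)
        page, scroll_id = index.scroll(scroll_id)
    return heights
```

Each call to `scroll` depends on the ID returned by the call before it, so a second worker has nothing to start from until the first worker has already fetched the preceding page.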

Implementation proposal

One solution is to have a single worker that fetches all the historical blocks by scrolling, while another worker fetches the latest blocks at the same time. There would also be stand-alone workers for fetching accounts, transactions, etc. The existing Worker would then no longer be needed: all this design does is copy the Elasticsearch database into the juno database using multiple threads.
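A rough sketch of this layout, under stated assumptions: the two sources are plain Python iterables instead of Elasticsearch queries, the juno database is a dict, and `history_worker`, `latest_worker` and `run_import` are hypothetical names. The stand-alone account and transaction workers are left out.

```python
import threading
from queue import Queue
from typing import Dict, Iterable, List

_DONE = object()  # sentinel a worker sends when it has nothing left to fetch


def history_worker(pages: Iterable[List[int]], out: Queue) -> None:
    """Single owner of the scroll: walks the pages strictly in order."""
    for page in pages:
        for height in page:
            out.put(("history", height))
    out.put(_DONE)


def latest_worker(new_heights: Iterable[int], out: Queue) -> None:
    """Independently forwards blocks produced after the scroll was opened."""
    for height in new_heights:
        out.put(("latest", height))
    out.put(_DONE)


def run_import(pages: Iterable[List[int]], new_heights: Iterable[int]) -> Dict[int, str]:
    """Run both workers concurrently and store what they fetch (dict = stand-in DB)."""
    out: Queue = Queue()
    threads = [
        threading.Thread(target=history_worker, args=(pages, out)),
        threading.Thread(target=latest_worker, args=(new_heights, out)),
    ]
    for thread in threads:
        thread.start()

    db: Dict[int, str] = {}
    remaining = len(threads)
    while remaining:
        item = out.get()
        if item is _DONE:
            remaining -= 1
            continue
        source, height = item
        # Keep the first copy if both workers happen to deliver the same height.
        db.setdefault(height, source)

    for thread in threads:
        thread.join()
    return db
```

For example, `run_import([[5, 4, 3], [2, 1]], [6, 7])` would store heights 1 to 7, with 6 and 7 coming from the latest-block worker. Only the history worker ever touches the scroll, so the scroll_id dependency stays inside one thread.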

Maybe there are better solutions, but I don't know of one at the moment.

HarleyAppleChoi commented 2 years ago

We can't support "skip enqueue blocks" with this implementation, so we may spend extra time re-parsing blocks that have already been parsed.

HarleyAppleChoi commented 2 years ago

@kwunyeung What do you think?