Closed xJonathanLEI closed 2 years ago
Also related:
The diagram states that the relayer connects to both the readers and the merged block stores. However, the actual code seems to indicate that the relayer doesn't really care about merged blocks:
In fact, it only cares about reader gRPC URLs and one-block storage. Nothing about merged blocks has been requested.
From what I've found so far, it looks like only the front-end process `firehose` is dealing with merged blocks, not the relayers, meaning that the diagram is indeed outdated. Is that correct?
Actually, another thing that further confuses me is: why would the relayer need access to the one-block storage, if it:
Thanks a lot!
Just in case it's requested but not used in the actual code, I tried pointing them to an empty folder, but then the process stops functioning. So it seems like `firehose` indeed needs access to those. Is the diagram outdated or am I misunderstanding something here? Much thanks in advance!
The diagram is indeed outdated, and has been for about a month or so. We refactored a number of internals to remove forked blocks from merged blocks and to improve how we bridge the "live" segment of the chain (one-blocks) and the historical segment (merged blocks). Components that need to do this bridging now access both the one-block store and the merged-blocks store.
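To make the bridging idea concrete, here is a minimal sketch, not the actual Firehose code. It assumes one block per height (forks are ignored) and that each source is sorted by block number. The `Block` type and the `bridge` function are hypothetical names introduced only for illustration.

```go
package main

import "fmt"

// Block is a hypothetical, minimal stand-in for a chain block.
type Block struct {
	Num uint64
}

// bridge is a hypothetical helper that stitches three sources into one
// contiguous sequence beginning at block number start. Sources are consumed
// from oldest to newest: merged blocks (historical segment), one-blocks
// (recent segment), then live blocks. A block is emitted only when it is
// exactly the next expected number, so overlaps between sources are skipped.
func bridge(start uint64, merged, oneBlocks, live []Block) []Block {
	var out []Block
	next := start
	for _, src := range [][]Block{merged, oneBlocks, live} {
		for _, b := range src {
			if b.Num != next {
				continue // already emitted, or not the block we are waiting for
			}
			out = append(out, b)
			next++
		}
	}
	return out
}

func main() {
	merged := []Block{{1}, {2}, {3}, {4}}
	oneBlocks := []Block{{3}, {4}, {5}, {6}}
	live := []Block{{6}, {7}, {8}}

	for _, b := range bridge(2, merged, oneBlocks, live) {
		fmt.Println("block", b.Num)
	}
}
```

The point of the sketch is only that a component doing this stitching has to read from both stores, which matches the requirement described above.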
The diagram states that the relayer connects to both the readers and the merged block stores. However, the actual code seems to indicate that the relayer doesn't really care about merged blocks:
Indeed, the relayer does not access any merged blocks. The diagram is still correct, though it could be clearer: one-blocks are stored in the object store, so the relayer does access it. The diagram should probably be split into two object stores, one for one-blocks and one for merged blocks, to make it more precise.
You are right that the live blocks come from the "reader" node. However, the reader is a "hot" source of blocks, meaning that blocks are simply broadcast on the gRPC connection as the reader node reads them. If a relayer disconnects for, say, 30 seconds, then on resume it will have missed a few blocks. In that case, the relayer fetches from the one-block store to fill the holes it has not seen, which lets it become ready faster than waiting to receive more live blocks from the reader node(s).
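The gap-filling behaviour described above can be sketched roughly as follows. This is an illustration under simplifying assumptions (a linear chain, an in-memory map standing in for the one-block object store), not the relayer's real implementation. `OneBlockStore`, `Relayer`, `NewRelayer`, and `OnLiveBlock` are hypothetical names.

```go
package main

import "fmt"

// Block is a hypothetical, minimal stand-in for a chain block.
type Block struct {
	Num uint64
}

// OneBlockStore is a hypothetical in-memory stand-in for one-block storage,
// keyed by block number.
type OneBlockStore map[uint64]Block

// Relayer is a hypothetical relayer that forwards blocks in order.
type Relayer struct {
	store   OneBlockStore
	lastNum uint64  // number of the last block forwarded downstream
	Out     []Block // blocks forwarded so far
}

// NewRelayer creates a relayer that has already forwarded up to lastSeen.
func NewRelayer(store OneBlockStore, lastSeen uint64) *Relayer {
	return &Relayer{store: store, lastNum: lastSeen}
}

// OnLiveBlock handles a block from the reader's hot stream. If blocks were
// missed (for example during a disconnect), it first backfills the hole from
// the one-block store, then forwards the live block itself.
func (r *Relayer) OnLiveBlock(b Block) error {
	for n := r.lastNum + 1; n < b.Num; n++ {
		missed, ok := r.store[n]
		if !ok {
			return fmt.Errorf("block %d not found in one-block store", n)
		}
		r.Out = append(r.Out, missed)
		r.lastNum = n
	}
	if b.Num > r.lastNum { // ignore blocks that were already forwarded
		r.Out = append(r.Out, b)
		r.lastNum = b.Num
	}
	return nil
}

func main() {
	// The relayer saw block 100, then lost its connection while the reader
	// produced 101..103 and wrote them to one-block storage.
	store := OneBlockStore{101: {Num: 101}, 102: {Num: 102}, 103: {Num: 103}}
	r := NewRelayer(store, 100)

	// The connection resumes and the first live block received is 104.
	if err := r.OnLiveBlock(Block{Num: 104}); err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, b := range r.Out {
		fmt.Println("forwarded block", b.Num)
	}
}
```

Without the store lookup, the hypothetical relayer in this sketch would have no way to recover blocks 101 to 103, since the hot stream does not replay what was already broadcast.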
That's very clear. Thanks a lot!
If this data-flow diagram from this page is accurate, the `firehose` component should only need access to relayer(s), which would in turn fetch data from upstream storage (and live readers).

However, it looks like the `firehose` component is requesting storage URLs:

https://github.com/streamingfast/firehose-acme/blob/6966e1a3aaf49d2d398686333967299e97bde05b/cmd/fireacme/cli/firehose.go#L108-L109

Just in case it's requested but not used in the actual code, I tried pointing them to an empty folder, but then the process stops functioning. So it seems like `firehose` indeed needs access to those. Is the diagram outdated or am I misunderstanding something here? Much thanks in advance!