Suggestion: Remember previous subgraph data source addresses and blocks for optimization

fubhy commented 5 years ago

Currently, every time you re-deploy a subgraph, it starts syncing from Block #0, while in most cases the contracts of interest only become relevant at 80% of the way through the sync. Previously, I suggested that it would be cool to be able to define a "starting block" number. Instead, or additionally, it might add some value to make this the default behavior:

On every deployment, save the block number at which each of the provided data source addresses first occurs. On subsequent deployments, we could check the subgraph manifest for any "unknown" addresses. If there are new addresses, sync again from Block #0, but if we know all the addresses and the block at which each becomes relevant, just start from the earliest of these blocks.
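
To make the idea concrete, here is a rough sketch of the start block selection described above. It is not graph-node code: `chooseStartBlock`, `FirstSeenBlocks`, and the example addresses are hypothetical names made up for illustration, and the sketch assumes the remembered addresses are stored in lowercase.

```typescript
// Hypothetical sketch only: none of these names exist in graph-node.
type Address = string;

// Block at which each data source address was first seen during a
// previous sync. Assumption: keys are stored in lowercase.
type FirstSeenBlocks = Map<Address, number>;

function chooseStartBlock(
  manifestAddresses: Address[],
  firstSeen: FirstSeenBlocks,
): number {
  let earliest = Number.POSITIVE_INFINITY;
  for (const address of manifestAddresses) {
    const block = firstSeen.get(address.toLowerCase());
    if (block === undefined) {
      // Unknown address: we cannot skip anything, so sync from Block #0.
      return 0;
    }
    earliest = Math.min(earliest, block);
  }
  // A manifest without addresses gives us nothing to go on, so fall back to 0.
  return Number.isFinite(earliest) ? earliest : 0;
}

// Example data, invented for illustration.
const remembered: FirstSeenBlocks = new Map([
  ["0xaaa", 7_000_000],
  ["0xbbb", 6_500_000],
]);

// All addresses known: start from the earliest remembered block.
const knownStart = chooseStartBlock(["0xAAA", "0xbbb"], remembered);

// One address is new: start over from Block #0.
const unknownStart = chooseStartBlock(["0xaaa", "0xccc"], remembered);
```

With this example data, `knownStart` is 6500000 and `unknownStart` is 0.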

leoyvens commented 5 years ago

This is clever, and I suppose an indexing node could do this optimization, but I'm not too eager to make the syncing behaviour depend on db state when there is a simple way of achieving this with the starting block you suggest.

fubhy commented 5 years ago

Maybe instead of storing it on your side, you could print the "first handled block" somewhere in the UI so one can then add it to the subgraph manifest?

schmidsi commented 3 years ago

I think this is resolved since we have a startBlock per data source and grafting. Otherwise, feel free to reopen this issue and clarify.

graphprotocol / support

Suggestion: Remember previous subgraph data source addresses and blocks for optimization #16