graphprotocol / graph-node

Graph Node indexes data from blockchains such as Ethereum and serves it over GraphQL
https://thegraph.com
Apache License 2.0

[Feature] Allow for both pruned and unpruned versions to be live simultaneously #5141

Open trader-payne opened 8 months ago

trader-payne commented 8 months ago

Description

As the title suggests, allow for both pruned and unpruned versions to be live simultaneously.

This means Subgraph X would exist in two forms at the same time: pruned and unpruned. Most importantly, both must be accessible simultaneously, with each query served by whichever form fits it.

If the block/data range queried falls inside the pruned version range, send queries to it. If not, send the historical queries to the unpruned version.

This has to be done with zero input from the infrastructure operator.
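To make the routing rule above concrete, here is a minimal sketch of the decision. It is not graph-node code: `Deployment`, `earliest_block`, `latest_block`, and `route_query` are hypothetical names invented for illustration, and the sketch assumes each version can report the block range for which it still holds history.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Deployment:
    """Hypothetical description of one live version of a subgraph."""
    name: str
    earliest_block: int  # oldest block for which history is still retained
    latest_block: int    # newest block this version has indexed


def route_query(block: Optional[int], pruned: Deployment, unpruned: Deployment) -> Deployment:
    """Pick the version that should answer a query pinned at `block`.

    `block=None` stands for a query at the latest block (assumption:
    such queries should always go to the pruned version).
    """
    if block is None:
        return pruned
    # Inside the range the pruned version still holds: send the query to it.
    if pruned.earliest_block <= block <= pruned.latest_block:
        return pruned
    # Otherwise treat it as a historical query for the unpruned version.
    if unpruned.earliest_block <= block <= unpruned.latest_block:
        return unpruned
    raise ValueError(f"block {block} is not covered by either version")


# Example values are made up for illustration.
pruned = Deployment("A-pruned", earliest_block=900, latest_block=1000)
unpruned = Deployment("A-unpruned", earliest_block=0, latest_block=1000)
chosen = route_query(10, pruned, unpruned)  # falls outside the pruned range
```

The point of the sketch is that the choice depends only on the queried block and each version's retained range, so it would need no operator input once both versions are deployed.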

Are you aware of any blockers that must be resolved before implementing this feature? If so, which? Link to any relevant GitHub issues.

No response


lutter commented 8 months ago

This would be very similar to split entity history. But while split entity history tries to achieve this within one subgraph, keeping separate subgraphs would treat the pruned and unpruned versions as fairly disconnected. Separate subgraphs will also cause more data duplication than split entity history, e.g., because immutable entities will have to be stored for both as duplicates.

trader-payne commented 8 months ago

> Separate subgraphs will also cause more data duplication than split entity history, e.g., because immutable entities will have to be stored for both as duplicates.

Finally giving ZFS dedup its time in the spotlight lol

azf20 commented 7 months ago

@trader-payne can you clarify your goal here? seems that you're not worried about database size, but want to serve fast queries with the pruned version?

trader-payne commented 7 months ago

@azf20 yes, correct. I want to have two (or maybe even more) versions of the same subgraph active at once. As a broad example, all for subgraph "A":

azf20 commented 7 months ago

OK cool. Would a read replica be an alternative approach for the third case?

trader-payne commented 7 months ago

Could be, but I've never tried it. I don't know whether you can use different Postgres settings in that case. Maybe @lutter knows. But I do know replicas require a lot of things that I normally get rid of for better indexing/data ingestion speed.

lutter commented 7 months ago

No, a read replica won't work here since the data will be exactly the same as in the main database.

github-actions[bot] commented 3 weeks ago

Looks like this issue has been open for 6 months with no activity. Is it still relevant? If not, please remember to close it.