Closed ryanchristo closed 1 year ago
I was thinking about this some more and I think we can do it in a way where groups-ui can still use the single graphql client connected to the indexer API. The indexer database model was set up so that it can support indexing multiple chains.
I think with some adjustments to the latest code in the indexer, we can have the production indexer index both mainnet and redwood. Important: I want to say it again, we will balloon the size of our database significantly by doing this. But if we are OK with that, then I think this is a good path to go.
We will need to adjust the PollingProcess class's run method to accept arguments for rpc and api endpoints:
https://github.com/regen-network/indexer/blob/main/utils.py#L66
Which in turn would get passed to: https://github.com/regen-network/indexer/blob/main/utils.py#L73C23-L73C34
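To make the idea a bit more concrete, here is a minimal sketch of a polling process that carries its own endpoints. This is not the indexer's actual class: the constructor signature, the poll_target callback, and max_polls are all assumptions for illustration. Since multiprocessing.Process.start() calls run() without arguments, the sketch takes the endpoints in the constructor and reads them inside run().

```python
import multiprocessing
import time
from typing import Callable, Optional


class PollingProcess(multiprocessing.Process):
    """Hypothetical sketch, not the class in the indexer's utils.py."""

    def __init__(
        self,
        poll_target: Callable[[str, str], None],
        rpc_url: str,
        api_url: str,
        sleep_secs: float = 1.0,
        max_polls: Optional[int] = None,
    ) -> None:
        super().__init__()
        # Endpoints are stored per instance instead of being read from
        # one global setting, so each chain can get its own process.
        self.poll_target = poll_target
        self.rpc_url = rpc_url
        self.api_url = api_url
        self.sleep_secs = sleep_secs
        # max_polls is an assumption added so the loop can be bounded
        # (for example in tests); None means poll forever.
        self.max_polls = max_polls

    def run(self) -> None:
        polls = 0
        while self.max_polls is None or polls < self.max_polls:
            # The indexing task receives the endpoints of this chain.
            self.poll_target(self.rpc_url, self.api_url)
            polls += 1
            time.sleep(self.sleep_secs)
```

Whether the endpoints go through the constructor or some other route is a design choice; the point is only that they stop being global.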
Currently the indexer only instantiates one PollingProcess per indexing task.
I.e.
Instead of having one PollingProcess per indexing task, we can have multiple.
There would be one PollingProcess per indexing process per chain.
We could use a new database table that configures this:
            Table "public.config_chain"
 Column  | Type | Collation | Nullable | Default
---------+------+-----------+----------+---------
 name    | text |           |          |
 rpc_url | text |           |          |
 api_url | text |           |          |
Each row would represent a chain that we want the indexer to run against.
Then at each of the call sites for PollingProcess we could instead have a loop that instantiates one per chain.
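A rough sketch of that loop, assuming the rows of the config table have already been loaded into memory. ChainConfig, build_workers, and the make_worker factory are hypothetical names, and the URLs are placeholders, not real endpoints.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, TypeVar

W = TypeVar("W")


@dataclass(frozen=True)
class ChainConfig:
    """One row of the proposed config_chain table."""

    name: str
    rpc_url: str
    api_url: str


def build_workers(
    chains: Sequence[ChainConfig],
    tasks: Sequence[str],
    make_worker: Callable[[str, ChainConfig], W],
) -> List[W]:
    # One worker per indexing task per chain, replacing the single
    # worker per task that exists today.
    return [make_worker(task, chain) for task in tasks for chain in chains]


# Placeholder rows; in practice these would come from the database.
CHAINS = [
    ChainConfig("mainnet", "http://rpc.mainnet.invalid", "http://api.mainnet.invalid"),
    ChainConfig("redwood", "http://rpc.redwood.invalid", "http://api.redwood.invalid"),
]
```

Here make_worker would be whatever constructs a PollingProcess for a given task and chain, so adding a chain only means adding a row.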
The reason this could work nicely is because the indexer data model uses a unique id called chain_num all throughout.
So for example, in production, we want to be able to toggle between redwood and mainnet, and as a result see the historical proposals in each network.
As shown above, the allProposals query has the chainNum field available.
So we can just write a query that says, "give me all proposals where the chain number is redwood's chain number".
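As a sketch of what that request could look like: this assumes the indexer API accepts a PostGraphile-style condition argument on allProposals and that chainNum is exposed as an Int. Neither is confirmed here, so treat the query shape as a guess. Only chainNum is selected because it is the only field mentioned in this thread.

```python
import json

# Assumed query shape (PostGraphile-style "condition" filter and "nodes" list).
PROPOSALS_BY_CHAIN = """
query ProposalsByChain($chainNum: Int!) {
  allProposals(condition: { chainNum: $chainNum }) {
    nodes {
      chainNum
    }
  }
}
"""


def build_request_body(chain_num: int) -> str:
    """Return the JSON body for a GraphQL POST filtered to one chain."""
    return json.dumps(
        {"query": PROPOSALS_BY_CHAIN, "variables": {"chainNum": chain_num}}
    )
```

Toggling the network in the UI would then only change the chainNum variable, not the client or the endpoint.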
If it's not already indexed, we'll need to add an index on the proposals table's chain_num column.
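For reference, the index itself is a one-liner. The sketch below runs it against an in-memory SQLite database purely as a stand-in so it can be tried locally; the production database is not SQLite, the index name is made up, and the two-column table is a simplified assumption rather than the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified stand-in for the real proposals table.
conn.execute("CREATE TABLE proposals (chain_num INTEGER, proposal_id INTEGER)")

# Hypothetical index name; IF NOT EXISTS makes the statement safe to re-run.
conn.execute(
    "CREATE INDEX IF NOT EXISTS proposals_chain_num_idx ON proposals (chain_num)"
)

index_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'proposals'"
    )
]
```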
This means that we wouldn't have to instantiate a whole new graphql client each time we toggle the chain in the groups-ui.
And also we would not strictly need to keep the staging version of the indexer database up and running for production groups-ui to work correctly.
It's a single database this way.
And it also scales better if we need to add other chains, since we would basically just need to add a row in the public.config_chain table or whatever configuration method we choose.
Thoughts?
I agree with running a process for redwood and mainnet using the same indexer deployment and database. It was designed for this purpose and we can prune indexed redwood state as needed to avoid unnecessary storage expenses.
I guess we would use the same URL in production for both mainnet and redwood and therefore may only need one variable for the two. But maybe we should consider adding REGEN to the variable name to make it clearer that this is the endpoint for regen mainnet and redwood testnet, which would leave room for other endpoints if and when another network is added.
I think we still want to prepare for multiple graphql endpoints in the case where a network is added and we are not hosting the data but someone from that network is (and they use our indexer to run a process specifically for their chain, making it available within our deployment of the groups ui).
From the groups-ui perspective, we just need to make sure we maintain support for multiple networks, which is how the groups-ui was designed. Whether we use the same indexer deployment and database for both regen mainnet and redwood testnet is something we can further discuss, but maybe to start we can continue towards updating the configuration.
I'll take an initial crack at what I was originally thinking and then maybe we can further discuss.
Rough idea is there: https://github.com/regen-network/groups-ui/pull/139
We could consolidate the regen environment variable (or maybe better to leave separate so we have options when testing) but the general idea is there. This way we continue leveraging chain information specific to each chain.
We might even want to consider removing the environment variables if they are the same for each environment (i.e. local, staging, and production). In some cases you may still want to swap them out though so maybe we could have defaults in place in the code so we don't have to worry about setting them correctly in staging and production. Thinking out loud a bit.
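To illustrate the "defaults in code, optional override" idea, here is a small sketch of the lookup logic. It is written in Python only to show the logic, not as groups-ui code. The variable naming scheme and the placeholder URL are assumptions; the keys are the chain ids for regen mainnet and redwood testnet.

```python
import os
from typing import Mapping, Optional

# Placeholder defaults; both regen networks could point at the same deployment.
DEFAULT_GRAPHQL_URLS = {
    "regen-1": "https://indexer.example.invalid/graphql",
    "regen-redwood-1": "https://indexer.example.invalid/graphql",
}


def graphql_url_for(chain_id: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the graphql endpoint for a chain, preferring an env override."""
    env = os.environ if env is None else env
    # Hypothetical naming scheme, e.g. GRAPHQL_URL_REGEN_1 for "regen-1".
    override = env.get("GRAPHQL_URL_" + chain_id.upper().replace("-", "_"))
    if override:
        return override
    # Raises KeyError for a chain that has no configured endpoint.
    return DEFAULT_GRAPHQL_URLS[chain_id]
```

With this shape, staging and production need no variables set at all, and a network hosted by someone else only needs a new default entry or an override.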
@wgwz whatever we decide here, let's open an issue in the indexer to discuss the adjustments you mentioned above. I think this is the right direction for managing indexing for regen mainnet and redwood testnet.
@ryanchristo the approach in your PR looks good to me so far, conceptually it makes sense how you're going about it and why. no strong opinion about having the environment variables or not since we'll effectively always have these endpoints configured in code with this approach (unless i'm misunderstanding). open to seeing those cleaned up however you think it makes sense.
Ref: https://github.com/regen-network/groups-ui/pull/112#issuecomment-1662849871
As a followup to #112, we should improve our QA testing setup and deployment pipeline to make indexed proposals on both mainnet and redwood available in both staging and production environments. Currently we have a single environment variable for setting the graphql endpoint that we set separately in staging and production environments.
The groups ui was designed to switch between networks and we should make sure we have full support for indexed proposals (endpoints in place at least) within the groups ui. If we add networks and those networks require a different endpoint (someone else is hosting the indexed proposals), the groups ui should be configurable to support an endpoint specific to the network.