cardano-foundation / cardano-graphql

GraphQL API for Cardano
Apache License 2.0

Takes too long to initialize cardano-graphql-server #831

Open ihooni opened 1 year ago

ihooni commented 1 year ago

Summary

This issue is related to the cardano-graphql-server that connects to the Cardano mainnet.

It takes too long for the GraphQL server to actually start.

In the logs below, about 8 minutes pass before the AdaPotsToCalculateSupply initial value is fetched and the GraphQL server starts.

2023-03-20T22:55:57+09:00 {"name":"cardano-graphql","pid":1,"level":30,"module":"CardanoNodeClient","msg":"Initializing. This can take a few minutes...","time":"2023-03-20T13:55:57.316Z","v":0}
2023-03-20T22:55:57+09:00 {"name":"cardano-graphql","pid":1,"level":30,"module":"CardanoNodeClient","msg":"Initialized","time":"2023-03-20T13:55:57.349Z","v":0}
2023-03-20T22:55:57+09:00 {"name":"cardano-graphql","pid":1,"level":30,"module":"Server","msg":"Initializing","time":"2023-03-20T13:55:57.349Z","v":0}
2023-03-20T22:55:57+09:00 {"name":"cardano-graphql","pid":1,"level":30,"module":"HasuraClient","msg":"Initializing","time":"2023-03-20T13:55:57.614Z","v":0}
2023-03-20T22:55:57+09:00 {"name":"cardano-graphql","pid":1,"level":20,"module":"HasuraClient","msg":"graphql-engine setup","time":"2023-03-20T13:55:57.870Z","v":0}
2023-03-20T23:03:48+09:00 {"name":"cardano-graphql","pid":1,"level":20,"module":"DataFetcher","instance":"AdaPotsToCalculateSupply","value":{"circulating":"34715319023093851","reserves":"9408537858242252"},"msg":"Initial value fetched","time":"2023-03-20T14:03:48.585Z","v":0}
2023-03-20T23:03:48+09:00 {"name":"cardano-graphql","pid":1,"level":20,"module":"HasuraClient","msg":"Data fetchers initialized","time":"2023-03-20T14:03:48.585Z","v":0}
2023-03-20T23:03:48+09:00 {"name":"cardano-graphql","pid":1,"level":30,"module":"HasuraClient","msg":"Initialized","time":"2023-03-20T14:03:48.585Z","v":0}
2023-03-20T23:03:48+09:00 {"name":"cardano-graphql","pid":1,"level":30,"module":"Server","msg":"GraphQL HTTP server at http://0.0.0.0:3100/ started","time":"2023-03-20T14:03:48.586Z","v":0}

Based on the logs and the code, it seems that most of the delay is happening in the following section: https://github.com/input-output-hk/cardano-graphql/blob/16f46c7a5a13f7b1786a77723fc04a9073582c63/packages/api-cardano-db-hasura/src/HasuraClient.ts#L128

That section runs the following GraphQL query: https://github.com/input-output-hk/cardano-graphql/blob/16f46c7a5a13f7b1786a77723fc04a9073582c63/packages/api-cardano-db-hasura/src/HasuraClient.ts#L62-L91

I think this query, which aggregates all reward, UTxO, and withdrawal data on mainnet, puts a very heavy load on the database. In fact, my Postgres IOPS jumped to 40,000 while HasuraClient was initializing.

How about simplifying the query run at initialization to reduce the time until the GraphQL server starts? Or is there a specific reason for executing this query?
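As an alternative to simplifying the query, the first fetch could run in the background so that server startup does not wait on it. The following is only a minimal sketch of that idea, not the project's actual DataFetcher: `LazyValue`, `start`, `get`, and the simulated slow query are all hypothetical names introduced for illustration.

```typescript
// Hypothetical sketch: a value whose first (expensive) load does not block startup.
class LazyValue<T> {
  private value: T | undefined;
  private ready = false;
  private pending: Promise<T> | undefined;
  private fetchFn: () => Promise<T>;

  constructor(fetchFn: () => Promise<T>) {
    this.fetchFn = fetchFn;
  }

  // Kick off the fetch once and return its promise; callers need not await it.
  start(): Promise<T> {
    if (this.pending === undefined) {
      this.pending = this.fetchFn().then((v) => {
        this.value = v;
        this.ready = true;
        return v;
      });
      // Swallow the rejection on this branch and allow a retry on the next call.
      this.pending.catch(() => {
        this.pending = undefined;
      });
    }
    return this.pending;
  }

  // Resolvers wait for the value only when it is actually requested.
  async get(): Promise<T> {
    if (this.ready) return this.value as T;
    return this.start();
  }

  isReady(): boolean {
    return this.ready;
  }
}

// Simulated slow query standing in for the real aggregate (assumption).
const supply = new LazyValue<string>(
  () => new Promise<string>((resolve) => setTimeout(() => resolve("34715319023093851"), 50))
);
supply.start(); // returns immediately; the HTTP server could start listening here
supply.get().then((v) => console.log("circulating:", v));
```

The trade-off is that the first request needing the value still waits for the query to finish, and the database load itself is unchanged; only the server's startup time is decoupled from it.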

Thanks.

Steps to reproduce the bug

  1. Prepare postgres server
  2. Execute db-sync for cardano mainnet
  3. Wait until db-sync catches up with the network
  4. Execute ogmios
  5. Execute graphql-background
  6. Execute graphql-server

Actual Result

Initializing the GraphQL server took about 8 minutes.

Expected Result

The GraphQL server starts within tens of seconds.

Environment

cardano-node: 1.35.5
ogmios: 5.5.8
inputoutput/cardano-graphql-hasura: 8.0.0
inputoutput/cardano-graphql-background: 8.0.0
inputoutput/cardano-graphql-server: 8.0.0
postgres: 12.12

Platform

Platform version

No response

Runtime

Runtime version

No response

xray-robot commented 1 year ago

Initializing token data from token-registry will soon take almost two days. I think we should check out the entire token-registry repository with git on init, process all the tokens, and then run the worker.

Min time: 8_000_000 tokens / 500 batch_size * 5 sec / 60 / 60 / 24 = ~0.93 days https://github.com/input-output-hk/cardano-graphql/blob/d8a01935554ccafd660c7abab8c8f9de35a7fd12/packages/api-cardano-db-hasura/src/Worker.ts#L90-L101
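The estimate above can be reproduced with a few lines of arithmetic. This is just a sketch of the calculation; `estimateMinSyncDays` is a hypothetical helper, and the inputs (8,000,000 tokens, batches of 500, 5 seconds per batch) are the figures quoted in this comment.

```typescript
// Hypothetical helper: lower bound on sync time when tokens are processed
// in fixed-size batches, each batch taking a fixed number of seconds.
function estimateMinSyncDays(
  tokenCount: number,
  batchSize: number,
  secondsPerBatch: number
): number {
  const batches = Math.ceil(tokenCount / batchSize); // 16,000 batches
  const totalSeconds = batches * secondsPerBatch;    // 80,000 seconds
  return totalSeconds / (60 * 60 * 24);              // seconds -> days
}

console.log(estimateMinSyncDays(8_000_000, 500, 5).toFixed(2)); // prints "0.93"
```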

Also, the initialization check puts an excessive load on the database. https://github.com/input-output-hk/cardano-graphql/blob/d8a01935554ccafd660c7abab8c8f9de35a7fd12/packages/server/src/Server.ts#L121-L158

In general, the cardano-graphql stack puts an unreasonably high load on db-sync even after initialization. Hopefully these issues will be solved soon. 🤔