alephium / explorer-backend

The explorer backend for Alephium protocol
https://explorer.alephium.org
GNU Lesser General Public License v3.0

Issue syncing past 12th June - incorrect blockflowhost #572

Open TheTrunk opened 1 month ago

TheTrunk commented 1 month ago

We run the backend under Docker with the env var "BLOCKFLOW_HOST=fluxdaemon_alphexplorer".

The backend log confirms this on init:

############################################
Explorer-backend started with ReadWrite mode
Network id: 0
Database: jdbc:postgresql://fluxpostgres_alphexplorer:5432/explorer
Syncing blocks with node at: http://fluxdaemon_alphexplorer:12973
Access the API at: 0.0.0.0:9090/docs
Activate debug logs with:

  • The API: curl -X 'PUT' '0.0.0.0:9090/utils/update-global-loglevel' -d 'DEBUG'
  • An environment variable: EXPLORER_LOG_LEVEL=DEBUG

############################################

However, syncing fails because the backend reaches out to 127.0.0.1:

2024-10-10 00:17:33,898 [ext-global-20] DEBUG o.a.e.util.Scheduler - Scheduler 'SYNC_SERVICES', Task 'MempoolSyncService': Scheduled with delay 5.seconds
2024-10-10 00:17:34,457 [ext-global-18] DEBUG o.a.e.s.BlockFlowSyncService$ - Downloading ghost uncles ArraySeq(BlockHash(hex"00000000000089f1332ea41f0930c83dc7b34d612d01bb83d0cd6f9b745d3d1b"))
2024-10-10 00:17:34,468 [ext-global-19] ERROR o.a.e.util.Scheduler - Scheduler 'SYNC_SERVICES', Task 'BlockFlowSyncService': Failed executing task
org.alephium.explorer.error.ExplorerError$UnreachableNode: Could not reach node.
    at org.alephium.explorer.service.BlockFlowClient$Impl$$anonfun$_send$2.applyOrElse(BlockFlowClient.scala:165)
    at org.alephium.explorer.service.BlockFlowClient$Impl$$anonfun$_send$2.applyOrElse(BlockFlowClient.scala:165)
    at scala.concurrent.impl.Promise$Transformation.run(Promise.scala:490)
    at java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(Unknown Source)
    at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
    at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source)
    at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source)
    at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
Caused by: sttp.client3.SttpClientException$ConnectException: Exception when sending request: GET http://127.0.0.1:12973/blockflow/blocks-with-events/00000000000089f1332ea41f0930c83dc7b34d612d01bb83d0cd6f9b745d3d1b
    at sttp.client3.SttpClientExceptionExtensions.defaultExceptionToSttpClientException(SttpClientExceptionExtensions.scala:13)
    at sttp.client3.SttpClientExceptionExtensions.defaultExceptionToSttpClientException$(SttpClientExceptionExtensions.scala:11)
    at sttp.client3.SttpClientException$.defaultExceptionToSttpClientException(SttpClientException.scala:24)
    at sttp.client3.asynchttpclient.AsyncHttpClientBackend.$anonfun$adjustExceptions$1(AsyncHttpClientBackend.scala:227)
    at sttp.client3.SttpClientException$$anonfun$adjustExceptions$1.applyOrElse(SttpClientException.scala:35)
    at sttp.client3.SttpClientException$$anonfun$adjustExceptions$1.applyOrElse(SttpClientException.scala:34)
    ... 7 common frames omitted
Caused by: java.net.ConnectException: Connection refused: /127.0.0.1:12973

Running curl against fluxdaemon_alphexplorer from our backend container works well; running it against 127.0.0.1 fails, as expected. Most likely there is a bug where the configured env variable is not respected and the backend falls back to 127.0.0.1 as the default.
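To make the mismatch between the configured host and the host actually contacted easier to spot in a long log, the comparison can be scripted. The following is a minimal sketch, assuming the failing requests always appear in the log as "Exception when sending request: GET <url>", as in the trace above. The function names `requested_hosts` and `unexpected_hosts` are hypothetical helpers, not part of explorer-backend.

```python
import re
from urllib.parse import urlsplit

# Matches the sttp error line format seen in the log above (assumed stable).
REQUEST_RE = re.compile(r"Exception when sending request: GET (\S+)")


def requested_hosts(log_text):
    """Return the set of host:port pairs that appear in failed GET requests."""
    hosts = set()
    for url in REQUEST_RE.findall(log_text):
        parts = urlsplit(url)
        hosts.add(f"{parts.hostname}:{parts.port}")
    return hosts


def unexpected_hosts(log_text, configured_host, configured_port):
    """Return contacted host:port pairs that differ from the configured one."""
    expected = f"{configured_host}:{configured_port}"
    return sorted(h for h in requested_hosts(log_text) if h != expected)


# Log excerpt copied from the trace above.
sample = (
    "Caused by: sttp.client3.SttpClientException$ConnectException: "
    "Exception when sending request: GET "
    "http://127.0.0.1:12973/blockflow/blocks-with-events/"
    "00000000000089f1332ea41f0930c83dc7b34d612d01bb83d0cd6f9b745d3d1b at sttp"
)

print(unexpected_hosts(sample, "fluxdaemon_alphexplorer", 12973))
```

With BLOCKFLOW_HOST and BLOCKFLOW_PORT as configured here, any host this reports is one the backend contacted that was not the configured one.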

tdroxler commented 1 month ago

Do you have the BLOCKFLOW_DIRECT_CLIQUE_ACCESS value set to true? If yes, can you try false, which is the default value? This value tells your explorer backend whether it should use the provided blockflow URI or the clique's URIs provided by the node.

If it doesn't work, can you share your docker and/or docker-compose config?
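The flag described above can be read as a simple selection rule. The sketch below only illustrates that reading of the comment; it is not the backend's actual code, and `select_node_uris` and the example clique URI are hypothetical.

```python
def select_node_uris(direct_clique_access, configured_uri, clique_uris):
    """Pick the URIs to sync from, following the description in the comment.

    direct_clique_access=True  -> use the URIs the node reports for its clique
    direct_clique_access=False -> use only the configured blockflow URI
    """
    if direct_clique_access:
        return list(clique_uris)
    return [configured_uri]


configured = "http://fluxdaemon_alphexplorer:12973"
# Hypothetical clique URI, as a node advertising a loopback address might report it.
clique = ["http://127.0.0.1:12973"]

print(select_node_uris(False, configured, clique))
print(select_node_uris(True, configured, clique))
```

Under this reading, a request to 127.0.0.1 would only be expected when the flag is true and the node advertises a loopback address for itself.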

TheTrunk commented 1 month ago

Hi, we use the following ENV variables:

                "DB_HOST=fluxpostgres_alphexplorer",
                "DB_PORT=5432",
                "DB_USER=postgres",
                "DB_PASSWORD=postgres",
                "DB_NAME=explorer",
                "EXPLORER_PORT=9090",
                "BLOCKFLOW_HOST=fluxdaemon_alphexplorer",
                "BLOCKFLOW_PORT=12973",
                "EXPLORER_HOST=0.0.0.0",
                "EXPLORER_LOG_LEVEL=DEBUG",
                "PATH=/opt/java/openjdk/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
                "JAVA_HOME=/opt/java/openjdk",
                "LANG=en_US.UTF-8",
                "LANGUAGE=en_US:en",
                "LC_ALL=en_US.UTF-8",
                "JAVA_VERSION=jdk-17.0.12+7",
                "JAVA_NET_OPTS=-Djava.net.preferIPv4Stack=true",
                "JAVA_MEM_OPTS=",
                "JAVA_GC_OPTS=",
                "JAVA_EXTRA_OPTS="

I do not think BLOCKFLOW_DIRECT_CLIQUE_ACCESS is set anywhere, so it should have the default value. We will add BLOCKFLOW_DIRECT_CLIQUE_ACCESS=false and report back.

These are the application configuration settings: https://api.runonflux.io/apps/appspecifications/alphexplorer

TheTrunk commented 1 month ago

Setting the env variable BLOCKFLOW_DIRECT_CLIQUE_ACCESS=false did not resolve the issue. The backend is still reaching out over 127.0.0.1:

Exception when sending request: GET http://127.0.0.1:12973/blockflow/blocks-with-events/00000000000089f1332ea41f0930c83dc7b34d612d01bb83d0cd6f9b745d3d1b

h0ngcha0 commented 1 month ago

Hey, could you please try setting NODE_ADDRESS in the full node configs below to an address that the explorer backend can access? It is probably set to 127.0.0.1 right now.

alephium.network.internal-address = $NODE_ADDRESS
alephium.network.coordinator-address = $NODE_ADDRESS

tdroxler commented 1 month ago

You can maybe take some inspiration from our alephium-stack docker-compose, as well as the corresponding user.conf.

TheTrunk commented 1 month ago

I tried from scratch and observed the same issue: it synced until the 12th of June and then stopped with the error.

However, I managed to get it to sync using your suggestion:

alephium.network.internal-address = "daemon:39973"
alephium.network.coordinator-address = "daemon:39973"

Needless to say I have

                    "DNSNames": [
                        "fluxdaemon_alphexplorer",
                        "10b19f05ea2b",
                        "daemon"
                    ]

and I had tried fluxdaemon_alphexplorer:39973 before, which results in an error. Only daemon:39973 works properly.
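Since several DNS names point at the same container but only one of them works in the node config, it can help to check from inside the backend container which names actually accept TCP connections. This is a minimal sketch using only the Python standard library; `can_connect` is a hypothetical helper, and it only tests TCP reachability, not whether the node accepts the name in its own configuration.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


def report(candidates, port):
    """Map each candidate host name to its reachability on the given port."""
    return {host: can_connect(host, port) for host in candidates}


# Example call, to be run inside the backend container:
# report(["fluxdaemon_alphexplorer", "daemon", "127.0.0.1"], 12973)
```

A name that resolves and connects here but still fails in the node config would point at the node's handling of the address rather than at Docker networking.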

tdroxler commented 1 month ago

It's very strange that it suddenly stops syncing at the exact same date. I suspect that the error is not actually an unreachable node. Can you actually get that block with curl? The URL is http://127.0.0.1:12973/blockflow/blocks-with-events/00000000000089f1332ea41f0930c83dc7b34d612d01bb83d0cd6f9b745d3d1b. From the logs, it's an uncle block; maybe your node doesn't have it. What versions of the node and the explorer-backend are you running?

TheTrunk commented 1 month ago

Yes, it always stops on the 12th of June, on all the nodes.

No, I can't fetch it over 127.0.0.1:12973. That is the thing: it should not be reaching out over 127.0.0.1:12973, but over what is configured for the blockflow host, fluxdaemon_alphexplorer:12973. Until the 12th of June, it syncs over fluxdaemon_alphexplorer:12973, but then something forces it to sync over 127.0.0.1:12973 instead.
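One way to test the earlier hypothesis that the node itself advertises 127.0.0.1 is to inspect the clique information the node reports and flag loopback addresses. The sketch below assumes the node exposes that information as JSON with a `nodes` list whose entries have `address` and `restPort` fields (for example via an `/infos/self-clique` endpoint); those names are assumptions about the node API and should be checked against its documentation. The sample payload is hypothetical.

```python
import ipaddress
import json


def loopback_nodes(self_clique):
    """Return address:restPort for clique nodes advertised on a loopback address."""
    flagged = []
    for node in self_clique.get("nodes", []):
        address = node.get("address", "")
        try:
            is_loopback = ipaddress.ip_address(address).is_loopback
        except ValueError:
            # Not an IP literal; treat only the conventional name as loopback.
            is_loopback = address == "localhost"
        if is_loopback:
            flagged.append(f"{address}:{node.get('restPort')}")
    return flagged


# Hypothetical payload, shaped after the assumed self-clique response.
sample = json.loads('{"nodes": [{"address": "127.0.0.1", "restPort": 12973}]}')
print(loopback_nodes(sample))
```

If the node reports a loopback address here, an explorer backend running in a different container cannot reach it through that address, which would match the connection refused error in the log.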