Open smuu opened 1 month ago
I think the simplest fix here is to remove https://github.com/celestiaorg/celestia-node/blob/main/libs/utils/address.go#L40, which resolves the IP a single time on start, and instead let clients use the domain as passed in, relying on the infra of the internet to work
unless there was a good reason we HAD to resolve the IP?
If I understand the code correctly, this would only resolve one part of the issue: the connection between the DA BN and the consensus node.
From my understanding, this code is not called when resolving the DNS in a multiaddr.
One workaround for this issue would be to recreate the connection once it fails after the IP address changes. This way, we don't need to add support to handle the DNS TTL, and the node would request the new IP address from the DNS server.
What's the status on this, @ramin @smuu?
Description:
We encountered an issue where changes to the DNS entries of DA nodes in Arabica caused light nodes to fail to sync. Restarting the light nodes resolved the issue, indicating that they resolve DNS once at startup and then use the same IP address indefinitely, ignoring DNS TTL.
Steps to Reproduce:
Suspected Cause: Light nodes resolve DNS entries only once at startup and continue using the same IP address without respecting the TTL. This affects both:
Relevant Code:
multiaddr DNS resolution: I could not find the relevant code.
--core.ip DNS resolution: https://github.com/celestiaorg/celestia-node/blob/f98d632818d566c7b4fd995b0f4bdc6443a7ed06/nodebuilder/core/config.go#L47
Potential Fix:
Repositories Potentially Needing Changes:
celestiaorg/celestia-node
libp2p/go-libp2p
multiformats/go-multiaddr
Impact: Not respecting DNS TTL can lead to connectivity and sync issues, affecting network reliability.
Request for Assistance: