libp2p / js-libp2p

The JavaScript Implementation of libp2p networking stack.
https://libp2p.github.io/js-libp2p/

WebRTCTransport.dial AbortError #2702

Open christroutner opened 1 month ago

christroutner commented 1 month ago

Severity:

Description:

I previously filed an issue about problems I was having with the @libp2p/webrtc package. That was resolved; the current package versions can be seen here, and the code for initializing libp2p can be found here.

I'm now encountering what appears to be a race condition inside the WebRTC libraries. The node runs for a while and then crashes at a random point with the following error message:

file:///home/safeuser/ipfs-service-provider/node_modules/race-signal/dist/src/index.js:22
        return Promise.reject(new AbortError(opts?.errorMessage, opts?.errorCode, opts?.errorName));
                              ^

AbortError: The operation was aborted
    at raceSignal (file:///home/safeuser/ipfs-bch-wallet-service/node_modules/race-signal/dist/src/index.js:22:31)
    at YamuxStream.closeWrite (file:///home/safeuser/ipfs-bch-wallet-service/node_modules/@libp2p/utils/dist/src/abstract-stream.js:230:19)
    at YamuxStream.close (file:///home/safeuser/ipfs-bch-wallet-service/node_modules/@libp2p/utils/dist/src/abstract-stream.js:189:18)
    at file:///home/safeuser/ipfs-bch-wallet-service/node_modules/libp2p/dist/src/connection/index.js:118:63
    at Array.map (<anonymous>)
    at ConnectionImpl.close (file:///home/safeuser/ipfs-bch-wallet-service/node_modules/libp2p/dist/src/connection/index.js:118:44)
    at initiateConnection (file:///home/safeuser/ipfs-bch-wallet-service/node_modules/@libp2p/webrtc/dist/src/private-to-private/initiate-connection.js:146:34)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async WebRTCTransport.dial (file:///home/safeuser/ipfs-bch-wallet-service/node_modules/@libp2p/webrtc/dist/src/private-to-private/transport.js:93:65)
    at async DefaultTransportManager.dial (file:///home/safeuser/ipfs-bch-wallet-service/node_modules/libp2p/dist/src/transport-manager.js:87:20)
    at async queue.add.peerId.peerId [as fn] (file:///home/safeuser/ipfs-bch-wallet-service/node_modules/libp2p/dist/src/connection-manager/dial-queue.js:168:38)
    at async raceSignal (file:///home/safeuser/ipfs-bch-wallet-service/node_modules/race-signal/dist/src/index.js:28:16)
    at async Job.run (file:///home/safeuser/ipfs-bch-wallet-service/node_modules/@libp2p/utils/dist/src/queue/job.js:55:28) {
  type: 'aborted',
  code: 'ABORT_ERR'
}

Node.js v20.17.0

Steps to reproduce the error:

The error does not occur right away. It appears at some point within the first 30 minutes of the node running and crashes the app. The process manager restarts it, and the crash then happens again within 30 minutes.

christroutner commented 1 month ago

This might be the same issue I reported in #2462. I'll take a closer look by replacing my node_modules and package-lock.json files and report back here.

However, I don't think this is the same issue, as I'm building the application into a Docker container with the --no-cache flag, so the node_modules folder should be installed from scratch. The package-lock.json file would still be copied from the repository, though, so maybe that is the issue.

I'll report back on my findings.

christroutner commented 1 month ago

I carefully deleted my node_modules folder and package-lock.json file before installing dependencies and I'm still getting the above error. As far as I can see it does not have anything to do with an unclean install as was claimed in #2462.

The main target that I'm testing is a libp2p node set up as a Circuit Relay server.

Chomtana commented 1 month ago

TURN works for regular internet connections across countries without this error, but it doesn't function properly with restrictive VPNs. This error indicates that WebRTC has failed to establish a connection with the peer.

christroutner commented 1 month ago

I wouldn't mind if WebRTC fails to connect, but this error causes the application to crash and exit, and there doesn't seem to be any way to wrap it with try/catch to handle the exception.

cristianmadularu commented 2 weeks ago

This is happening for us as well, causing our Node processes to crash.

cristianmadularu commented 2 weeks ago

> I wouldn't mind if webRTC fails to connect, but this error causes the application to crash and exit, and there doesn't seem to be any way to wrap it with try/catch to handle the exception.

@christroutner, while this is not a 'solution' (more of a temporary workaround), you might consider an application-level handler that keeps the application from crashing when that type of exception goes unhandled. It's a risky approach, since there is no guarantee that the app is still in a good state, but it's an ugly workaround nevertheless until this gets fixed.
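
For anyone who wants to try the application-level handler described above, here is a minimal sketch. It assumes the crash surfaces as an unhandled promise rejection, which the `Promise.reject(...)` frame in the trace suggests but does not confirm. The names `isAbortError` and `onUnhandledRejection` are made up for this example and are not part of libp2p; the `code` and `type` values are taken from the error output in the original report.

```javascript
// Sketch of a process-level safety net. The function names here are
// illustrative only; they are not libp2p APIs.

// Returns true for errors shaped like the one in the reported trace
// (code: 'ABORT_ERR', type: 'aborted', or name 'AbortError').
function isAbortError (err) {
  if (err == null || typeof err !== 'object') return false
  return err.code === 'ABORT_ERR' || err.type === 'aborted' || err.name === 'AbortError'
}

// Log and swallow abort-style rejections. Anything else is rethrown so that
// unrelated bugs are not silently hidden by this workaround.
function onUnhandledRejection (reason) {
  if (isAbortError(reason)) {
    console.warn('Ignored unhandled AbortError:', reason.message)
    return
  }
  throw reason
}

// Registering a listener replaces Node's default behaviour of treating an
// unhandled rejection as a fatal error.
process.on('unhandledRejection', onUnhandledRejection)
```

As noted above, this only hides the symptom: the connection that triggered the abort may be left in an unknown state, so it is worth logging these events and removing the handler once the underlying bug is fixed.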

christroutner commented 2 weeks ago

I appreciate the tip @cristianmadularu.

I ended up just disabling WebRTC in my application until this issue can be resolved. It would be great to have, but it's not a core requirement.