Even on SSL/TLS-encrypted connections, an attacker can mount a relatively inexpensive MITM attack that is extremely hard to detect and that allows intelligent tampering with the messages exchanged between the two endpoints.
This risk can be mitigated if at least one side has a certificate from a trusted root, but having a certificate is not always possible (or even desirable) in a decentralized and permissionless network.
Ultimately, our goal is to ensure that two endpoints A and B know that they are talking directly to each other over a single end-to-end SSL/TLS session instead of two SSL/TLS sessions, with an attacker acting as a proxy.
We can prevent this attack by leveraging the fact that both servers have secp256k1 keypairs, and using those keypairs to strongly bind the SSL/TLS session to the identity of each server.
To do this, we reach into the SSL/TLS session, extract the finished messages for the local and remote endpoints, and combine them to generate a unique "fingerprint" that is the same for both SSL/TLS endpoints. Each server then signs that fingerprint using its secp256k1 identity, and the signature is transferred over the SSL/TLS-encrypted link.
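As a rough illustration of the fingerprint step, here is a minimal Python sketch. The text above only says the two finished messages are "combined", so the combining rule used here (SHA-512 of each message, XOR of the two digests, then the first 32 bytes of a final SHA-512) is an assumption chosen because it is order-independent. The function name `session_fingerprint` and the byte strings are hypothetical placeholders, and the secp256k1 signing step is left out.

```python
import hashlib


def session_fingerprint(local_finished: bytes, remote_finished: bytes) -> bytes:
    """Derive a value that both ends of one TLS session compute identically.

    Assumed combining rule (not confirmed by the text above): hash each
    finished message, XOR the digests so argument order does not matter,
    then hash the result and keep the first 32 bytes.
    """
    h_local = hashlib.sha512(local_finished).digest()
    h_remote = hashlib.sha512(remote_finished).digest()

    # XOR is commutative, so A (local=A, remote=B) and B (local=B, remote=A)
    # arrive at the same bytes.
    mixed = bytes(a ^ b for a, b in zip(h_local, h_remote))

    # Identical finished messages would XOR to all zeros; treat as an error.
    if not any(mixed):
        raise ValueError("finished messages must differ")

    return hashlib.sha512(mixed).digest()[:32]


# Hypothetical finished messages for a direct A <-> B session.
fin_a = b"finished-message-sent-by-A"
fin_b = b"finished-message-sent-by-B"

fp_seen_by_a = session_fingerprint(fin_a, fin_b)  # A: local=A, remote=B
fp_seen_by_b = session_fingerprint(fin_b, fin_a)  # B: local=B, remote=A

# Hypothetical finished messages for a separate attacker <-> B session.
fin_m = b"finished-message-sent-by-attacker"
fin_b2 = b"finished-message-sent-by-B-to-attacker"
fp_attacker_session = session_fingerprint(fin_m, fin_b2)
```

In this sketch `fp_seen_by_a` and `fp_seen_by_b` are equal, so a signature made by A over its fingerprint verifies on B's side. The fingerprint of a second, separately negotiated session is a different value, so a signature over one session's fingerprint is of no use on the other.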
If an attacker establishes two separate SSL/TLS sessions, the fingerprints of the two sessions will be different. The attacker cannot sign the fingerprint of the session with B using A's private key, nor the fingerprint of the session with A using B's private key. Both A and B will therefore know that an active MITM attack is in progress and will close their connections.
If an attacker simply proxies the raw bytes, the attacker is unable to decrypt the data being transferred between A and B and cannot intelligently tamper with the message stream between the two endpoints.
Per @nbougalis: