BiagioFesta / wtransport

Async-friendly WebTransport implementation in Rust
Apache License 2.0
346 stars 19 forks source link

Bug report: `Connection::remote_address` returns IPv6 addresses despite using IPv4 #165

Closed finnbear closed 2 months ago

finnbear commented 2 months ago

I used with_bind_default(443) when constructing my server Endpoint though all my current clients connect over IPv4. However, Connection::remote_address returns IPv6 addresses consisting of mapped IPv4 addresses.

Expected behavior

Either:

  1. Make Connection::remote_address return SocketAddr::V4(SocketAddrV4) for IPv4 clients
  2. Make Connection::remote_address return SocketAddrV6 for all clients
  3. Document this issue

Workaround

fn addr(connection: &Connection) -> SocketAddr {
    let addr = connection.remote_address();
    if let SocketAddr::V6(v6) = addr && let Some(v4) = v6.ip().to_ipv4_mapped() {
        // seems reasonable
        SocketAddr::V4(SocketAddrV4::new(v4, v6.port()))
    } else if let SocketAddr::V6(v6) = addr && v6.ip().is_loopback() {
        // potentially unreasonable
        SocketAddr::V4(SocketAddrV4::new(Ipv4Addr::LOCALHOST, v6.port()))
    } else {
        addr
    }
}

Thanks for making this library!

BiagioFesta commented 2 months ago

Thank you for the feedback.

While it feels more reasonable to return an IPv4 address to me, I am wondering what should be the "correct behavior". Do we have some reference about that?

finnbear commented 2 months ago

Good question! I tried to find the documentation that best answers it (emphasis and translations mine). Analysis and conclusion at the end.

4.2. IPv6 Applications in a Dual-Stack Node

...

Most implementations of dual stack allow IPv6-only applications to interoperate with both IPv4 and IPv6 nodes. IPv4 packets going to IPv6 applications on a dual-stack node reach their destination because their addresses are mapped by using IPv4-mapped IPv6 addresses: the IPv6 address ::FFFF:x.y.z.w represents the IPv4 address x.y.z.w.

6. Developing IP Version - Independent Applications

As stated, dual applications working with both IPv4 and IPv6 are recommended. These applications should avoid IP dependencies in the source code. However, if IP dependencies are required, one of the better solutions would be to build a communication library that provides an IP version - independent API to applications and that hides all dependencies.

To develop IP version - independent applications, the following guidelines should be considered.

6.1. IP Version - Independent Structures

All memory structures and APIs should be IP version-independent. One should avoid structs in_addr Ipv4Addr, in6_addr Ipv6Addr, sockaddr_in SocketAddrV4, and sockaddr_in6 SocketAddrV6

Suppose a network address is passed to some function, foo(). If one uses struct in_addr Ipv4Addr or struct in6_addr Ipv6Addr, results an extra parameter to indicate address family, as below:

 struct in_addr in4addr;
 struct in6_addr in6addr;
  /* IPv4 case */
 foo(&in4addr, AF_INET);
  /* IPv6 case */
 foo(&in6addr, AF_INET6);

This leads to duplicated code and having to consider each scenario from both perspectives independently, which is difficult to maintain. So we should use struct sockaddr_storage SocketAddr, as below:

 struct sockaddr_storage ss;
 int sslen;
 /* AF independent! - use sockaddr when passing a pointer */
 /* note: it's typically necessary to also pass the length
    explicitly */
 foo((struct sockaddr *)&ss, sslen);

Analysis

RFC4038 Section 4.2 implies that mapped IPv4 is useful for IPv6-only applications. This sentiment is echoed here, here, and here.

RFC4038 Section 6 recommends dual-stack applications use sockaddr_storage SocketAddr with explicit IPv4 instead of sockaddr_in6 SocketAddrV6 with mapped IPv4.

Conclusion

Since this library supports dual-stack, I think the documentation supports making Connection::remote_address return SocketAddr::V4(SocketAddrV4) for IPv4 clients.

The one exception: If this library is configured in IPv6 only mode (not the default used by with_bind_default(443) which is documented as "ANY IP (allowing IP dual-stack)"), consider returning SocketAddr::V6(SocketAddrV6) for all clients. As far as I know, the IPv4 mapping for an IPv6 socket is done by the operating system, so it's possible this library wouldn't need a special case.

BiagioFesta commented 2 months ago

Thank you for the investigation; that's definitely a good start.

RFC-4038 suggests using an API with an IP-family independent structure, and it seems like the wtransport API already follows that approach, doesn't it?

Connection::remote_address returns an "unaware" SocketAddr (neither a SocketAddrV4 nor SocketAddrV6). From this perspective, it seems to be compliant with the above.

Adding to this, under the hood, Connection::remote_address returns the socket address provided by the system call. I haven't delved too deeply into it, but I'm wondering if we can simply trust the system implementation.

Instead, what I wanted to investigate is the usage of the mapped address. When obtaining the peer address, I want the application to be able to send data back to that same address, regardless of whether it is an IPv4 mapped address or not.

I did a quick experiment and, indeed, I was able to send packets back to [::ffff:127.0.0.1] (i.e., a only-V4 bound client) from a V6-dual stack server (Linux platform).

I am still not totally clear on the approach here.

finnbear commented 2 months ago

it seems to be compliant with the above.

Yeah, section 6 allows the current implementation and my first "expected behavior". I included it because it disallows my second "expected behavior" of always returning SocketAddrV6 for clarity (which would also be a breaking change).

Second 4.2 and the other references I linked are what implicitly disallow the current implementation, because the current implementation returns mapped addresses even when not targeting IPv6-only. I couldn't find any reference that explicitly disallows the current implementation of returning a mapped address when an unmapped address would suffice.

I'm wondering if we can simply trust the system implementation

You're right that this is an upstream issue (quinn or OS). Maybe documentation is the best path forward for this library.

impl {SessionRequest, Connection} {
    /// Returns the peer's UDP address.
    ///
    /// **Note**: as QUIC supports migration, remote address may change
    /// during connection. Furthermore, when IPv6 support is enabled, IPv4
    /// addresses may be mapped to IPv6.
    #[inline(always)]
    pub fn remote_address(&self) -> SocketAddr {
        self.quic_connection.remote_address()
    }
}
BiagioFesta commented 2 months ago

Absolutely, it sounds like we're on the same page. I am just trying to understand if we get some strong reference about that.

While we may not have a definitive reference to rely on, adjusting the documentation to clarify the behavior seems like a prudent step forward which definitely brings benefit quickly.

Regarding documentation, feel free to open a PR (adjusting the content as you think it is better) if you have time, otherwise I will in the next days

Again, thank you