dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License

Uri does not encode (or double-encodes) non-URL safe base64 path segments. #57015

Open diegotori opened 3 days ago

diegotori commented 3 days ago

When creating a Uri with a base64-encoded path segment that is not URL-safe (i.e., one containing characters such as =), the value is either left unencoded or, if it was encoded before being passed in, double-encoded. This happens both when creating a Uri from scratch and when calling replace on an existing one.

By contrast, when the base64 value is encoded with Uri.encodeComponent and the Uri is created with Uri.parse, the unsafe == is correctly kept as %3D%3D.

However, when an unencoded base64 value is passed as a path segment, either to the Uri constructor or to replace on an existing Uri, calling toString on the result still yields ==. Furthermore, when the value is first encoded with Uri.encodeComponent and then passed as a path segment, the already encoded %3D%3D is double-encoded to %253D%253D.

Here is code that highlights this issue:

void main() {
  final baseUri = Uri(
    scheme: "https",
    host: "www.something.com",
    pathSegments: ["share"],
  );

  // Given an unsafe base64 URL value
  const unsafeBase64UrlValue = "c29tZSB2YWx1ZQ==";

  // When encoding it for use in a URL.
  final encoded = Uri.encodeComponent(unsafeBase64UrlValue);

  // Properly retains the parsed URL's encoded path segments.
  final properlyEncoded =
      Uri.parse("https://www.something.com/share/$encoded").replace(
    queryParameters: {
      "foo": "bar",
    },
  );
  print("Properly encoded base64 path URI: ${properlyEncoded.toString()}");

  // Does not encode the unsafe base64 path segment
  final unencodedBase64PathUri = baseUri.replace(
    pathSegments: [
      "share",
      unsafeBase64UrlValue,
    ],
    queryParameters: {
      "foo": "bar",
    },
  );
  print("Unencoded base64 path URI: ${unencodedBase64PathUri.toString()}");

  // Double encodes the encoded base64 path segment
  final doubleEncodedBase64PathUri = baseUri.replace(
    pathSegments: [
      "share",
      encoded,
    ],
    queryParameters: {
      "foo": "bar",
    },
  );
  print(
      "Double encoded base64 path URI: ${doubleEncodedBase64PathUri.toString()}");
}

Also available as a DartPad.

Currently running the following Dart version on macOS 14.5:

Dart SDK version: 3.5.3 (stable) (Wed Sep 11 16:22:47 2024 +0000) on "macos_arm64"

dart-github-bot commented 3 days ago

Summary: When a new Uri is created or an existing one is replaced, the Uri class either leaves non-URL-safe base64 path segments unencoded or double-encodes them. The resulting URIs contain unencoded or double-encoded path segments.

lrhn commented 3 days ago

The = character is a valid pchar (it's a sub-delimiter), so it doesn't need to be escaped in a path segment. That's why the replace function doesn't escape it.

The pathSegments argument is assumed to contain unescaped source path segments (not pre-escaped URI path segments), so the % of an already escaped value is itself escaped, which is the double encoding seen above.

Uri.encodeComponent is used for query elements, like the name and value of ?name=value, so it escapes = characters. It doesn't know that the text will be used as a path segment, where = would be OK.

So, all in all, working as intended.
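
For readers who hit the same surprise, here is a minimal sketch of the three behaviours described above. It reuses the host and base64 value from the original report; the variable names (raw, built, doubled, parsed) are illustrative only and do not come from the SDK. It is one way to see the rule in action, not an official recommendation.

```dart
void main() {
  const raw = 'c29tZSB2YWx1ZQ==';

  // pathSegments expects unescaped text. '=' is legal in a path segment,
  // so it is written out unchanged.
  final built = Uri(
    scheme: 'https',
    host: 'www.something.com',
    pathSegments: ['share', raw],
  );
  print(built); // https://www.something.com/share/c29tZSB2YWx1ZQ==
  print(built.pathSegments.last == raw); // true

  // Passing a pre-escaped value: the '%' is treated as literal text and is
  // itself escaped to %25, which yields the "double encoding".
  final doubled = built.replace(
    pathSegments: ['share', Uri.encodeComponent(raw)],
  );
  print(doubled.path); // /share/c29tZSB2YWx1ZQ%253D%253D

  // Uri.parse takes already escaped URI text, so %3D%3D is kept in the
  // path, while pathSegments returns the decoded value.
  final parsed = Uri.parse(
    'https://www.something.com/share/${Uri.encodeComponent(raw)}',
  );
  print(parsed.path); // /share/c29tZSB2YWx1ZQ%3D%3D
  print(parsed.pathSegments.last == raw); // true
}
```

In both the constructed and the parsed case, pathSegments hands back the original unescaped value, so the practical rule is to pass raw values to pathSegments and reserve pre-escaped text for Uri.parse.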

diegotori commented 3 days ago

@lrhn I think it might make sense to point out this caveat in the existing documentation. Otherwise, other devs might get tripped up going forward.

lrhn commented 3 days ago

True, let's make it a documentation issue.