smithy-lang / smithy

Smithy is a protocol-agnostic interface definition language and set of tools for generating clients, servers, and documentation for any programming language.
https://smithy.io
Apache License 2.0
1.71k stars 202 forks source link

`createSmithyJarManifestUrl`: incorrect handling of complex paths #2103

Open kubukoz opened 5 months ago

kubukoz commented 5 months ago

Hi! I first bumped into this over a year ago, but so far had a workaround that seemed decent, and the usecase seemed pretty niche. Now me and my team maintain about 4 copies of the same workaround, so it's reaching a point where it'd be best to resolve upstream - right here.

What I'm trying to do

Originally, here's how we did it in smithy4s:

val modelsInJars = deps.flatMap { file => // java.io.File
   val manifestUrl = 
     ModelDiscovery.createSmithyJarManifestUrl(file.getAbsolutePath()) 
   try { ModelDiscovery.findModels(manifestUrl).asScala } 
   catch { 
     case _: ModelManifestException => Seq.empty //fallback to nothing
   } 
 }

So far so good.

The problem

Trouble begins when certain artifacts are involved, namely:

For example, org.polyvariant:test-library-core_2.13:0.0.1+123-SNAPSHOT from Sonatype Snapshots would be placed in:

$COURSIER_CACHE_LOCATION/v1/https/s01.oss.sonatype.org/content/repositories/snapshots/org/polyvariant/test-library-core_2.13/0.0.1%2B123-SNAPSHOT/test-library-core_2.13-0.0.1%2B123-SNAPSHOT.jar

Here's what happens when you use that file with ModelDiscovery:

I think this is an issue with ModelDiscovery.findModels.

The expected behavior can be achieved with the following workaround: treat the jar as a FileSystem, and use the Path API to get an URL:

Using.resource(
  FileSystems.newFileSystem(file.toPath())
) { fs =>
  val path = fs.getPath("META-INF", "smithy", "manifest")
  ModelDiscovery.findModels(path.toUri().toURL())
}

The path.toUri() part stringifies as: jar:file://$COURSIER_FILE_LOCATION/v1/https/s01.oss.sonatype.org/content/repositories/snapshots/org/polyvariant/test-library-core_2.13/0.0.1%252B123-SNAPSHOT/test-library-core_2.13-0.0.1%252B123-SNAPSHOT.jar!/META-INF/smithy/manifest - which, to me, sounds correct.

Request

I think there's more than one issue here:

  1. createSmithyJarManifestUrl going from URL (a relatively structured type) to a String (no structure at all), offering little to no type safety when it's used down the line
  2. findModels doing something strange with its parameter, discarding the urlencoding of the given path.

so the ideal solution would also be twofold:

Context, if you want to learn more about the issue we had and the workaround: https://github.com/disneystreaming/smithy4s/pull/850

Notes / questions

kubukoz commented 5 months ago

@mtdowling QQ, what kind of reproduction would work best for you? This scala-cli snippet showcases the problem (just have to switch the withSpecialCharacters flag on and off to see the difference between normal and "strange" version patterns)

mtdowling commented 5 months ago

The best possible repro would be a unit test showing the broken current behavior, but I know that’s challenging here. Or maybe a known dependency on maven or something we can test?

kubukoz commented 4 months ago

In theory this would suffice:

{
  "version": "1.0",
  "maven": {
    "repositories": [
      {
        "url": "https://s01.oss.sonatype.org/content/repositories/snapshots"
      }
    ],
    "dependencies": [
      "org.polyvariant:test-library-core_2.13:0.0.1+123-SNAPSHOT"
    ]
  }
}

but both Smithy CLI and the LSP claim they don't allow + in versions.