scala / bug

Scala 2 bug reports only. Please, no questions — proper bug reports only.
https://scala-lang.org

Scaladoc 2.13.11 failing on illegal URI character in file name #12846

Closed mdedetrich closed 10 months ago

mdedetrich commented 10 months ago

Reproduction steps

Scala version: 2.13.11

final case class `X-Upload-Content-Type` private (contentType: ContentType)

Run scaladoc (i.e. doc in sbt) to generate Scaladoc from the above Scala code

Problem

With the above code I get the following error:

[error] java.net.URISyntaxException: Illegal character in path at index 143: https://github.com/apache/incubator-pekko-connectors/tree/main/google-common/src/main/scala/org/apache/pekko/stream/connectors/google/scaladsl/`X-Upload-Content-Type`.scala#L37

This appears to be a regression in Scala 2.13.11, as the same code works fine in Scala 2.13.10. See https://github.com/apache/incubator-pekko-connectors/pull/139 for more context.

The complete stack trace is available at https://gist.github.com/mdedetrich/43d822f08e329f98d72e8131fd807e94

som-snytt commented 10 months ago

The difference is not in the source but the filename, which includes backticks in the project. (!)

scaladoc -doc-source-url "http://acme.com/€{FILE_PATH_EXT}#L€{FILE_LINE}" \`fubar\`.scala

errors as shown in 2.13.11.

[error] java.net.URISyntaxException: Illegal character in path at index 143: https://github.com/apache/incubator-pekko-connectors/tree/main/google-common/src/main/scala/org/apache/pekko/stream/connectors/google/scaladsl/`X-Upload-Content-Type`.scala#L37

Here is the github "permalink", which is encoded. Possibly Firefox encodes it on click anyway?

https://github.com/apache/incubator-pekko-connectors/blob/329eaa8501a5194655c2340fbb226124a57c9df7/google-common/src/main/scala/org/apache/pekko/stream/connectors/google/scaladsl/%60X-Upload-Content-Type%60.scala

The change replaced the deprecated URL constructor with URI, which rejects the backticks because they are not percent-encoded (written as hex escapes such as %60).

Maybe it should use relativized File#toURI/Path#toUri which encodes bad chars (not toURL, which does not!).
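
As a rough illustration of the difference described above (not the actual scaladoc code), the sketch below contrasts the strict single-argument URI constructor with two JDK routes that percent-encode the backticks. The host example.com, the src directory, and the object name BacktickUriDemo are placeholders chosen for this example.

```scala
import java.net.URI
import java.nio.file.Paths
import scala.util.Try

object BacktickUriDemo {
  // File name with literal backticks, mirroring the one in the report.
  val fileName = "`X-Upload-Content-Type`.scala"

  // The single-argument constructor parses strictly, so a raw backtick
  // in the path makes it throw URISyntaxException.
  val strict: Try[URI] = Try(new URI(s"https://example.com/src/$fileName"))

  // Path#toUri percent-encodes characters that are illegal in a URI path.
  val encodedPath: String = Paths.get("src", fileName).toUri.getRawPath

  // The multi-argument constructor (scheme, host, path, fragment) also
  // quotes illegal characters in the path component.
  val quoted: URI = new URI("https", "example.com", s"/src/$fileName", "L37")

  def main(args: Array[String]): Unit = {
    println(s"strict parse failed: ${strict.isFailure}")
    println(s"Path#toUri raw path: $encodedPath")
    println(s"multi-arg URI:       $quoted")
  }
}
```

Either encoding route yields %60 in place of each backtick, which matches the form of the GitHub permalink quoted earlier.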

The spec says plainly that a backquoted identifier excludes the backquotes:

Finally, an identifier may also be formed by an arbitrary string between backquotes (host systems may impose some restrictions on which strings are legal for identifiers). The identifier then is composed of all characters excluding the backquotes themselves.

So backticks in source file names are not motivated.
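
A small sketch of the spec passage quoted above, using a simplified stand-in for the class in the report (a String field instead of ContentType, no private constructor; the object name BackquotedNameDemo is made up for this example). It shows that the name the compiler records for a backquoted identifier does not contain the backquotes.

```scala
object BackquotedNameDemo {
  // Simplified stand-in for the case class from the reproduction steps.
  final case class `X-Upload-Content-Type`(contentType: String)

  val header = `X-Upload-Content-Type`("text/plain")

  // productPrefix is the case class name as the compiler sees it:
  // the characters between the backquotes, without the backquotes.
  val name: String = header.productPrefix

  def main(args: Array[String]): Unit = println(name)
}
```

On that reading, a source file for this class could be named X-Upload-Content-Type.scala without the backticks.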