swiftlang / swift

The Swift Programming Language
https://swift.org
Apache License 2.0

Regex literals mess with the ability to comment out code #77725

Open dabrahams opened 6 days ago

dabrahams commented 6 days ago

Description

/* ... */ bracketing is no longer a reliable way to comment out code.

Reproduction

/*
  // Get the set of manifests before getting individual manifests.
  #"GET /rustlang/manifests.txt"# .comesBefore(#"GET /rustlang/dist/.*/[^/]*\.toml"#),

  // Get a particular manifest before trying to download its content.
  #"GET /rustlang/dist/(?<_1>:[^/]*)/[^/]*\.toml"#
    .comesBefore(#"GET /rustlang/dist/(?<_1>:[^/]*)/[^/]*\.tar\.gz"#),

// Download an artifact before you upload it.
  #"GET /rustlang/dist/[^/]*/(?<_1>:[^/]*)-(?<_2>:[^/]*)-(?<_3>:[^/]*)\.tar\.gz"#
    .comesBefore(#"PUT /artifactory/test-repo/(?<_2>:[^/]*)/(?<_1>:[^/]*)/(?<_3>:[^/]*)/\k<_1>-\k<_2>-\k<_3>\.tar\.gz"#)
]
)
*/

I get

/Users/dave/src/Nick/Sources/Nick/Nick.swift:118:73: error: unterminated regex literal
116 | /*
117 |   // Get the set of manifests before getting individual manifests.
118 |   #"GET /rustlang/manifests.txt"# .comesBefore(#"GET /rustlang/dist/.*/[^/]*\.toml"#),
    |                                                                         `- error: unterminated regex literal
119 | 
120 |   // Get a particular manifest before trying to download its content.

Expected behavior

Compiles as a no-op.

Environment

swift-driver version: 1.115 Apple Swift version 6.0 (swiftlang-6.0.0.9.10 clang-1600.0.26.2)
Target: arm64-apple-macosx15.0

Additional information

No response

hamishknight commented 5 days ago

I should note this isn't really specific to regex literals, e.g. you also get the behavior with:

/*
let x = "*/"
*/
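
As a rough illustration of why this reduction fails, the sketch below mimics how a block comment is scanned: /* and */ are counted (Swift block comments nest), and string or regex literals inside the comment are not recognized. The function name textAfterBlockComment is made up for this example and is not part of the compiler; the real lexer is more involved.

```swift
/// Returns the text left over after the block comment that starts `source`,
/// or nil if `source` does not start with "/*" or the comment never closes.
/// Illustrative only: a simplified model, not the compiler's lexer.
func textAfterBlockComment(_ source: String) -> String? {
    let chars = Array(source)
    guard chars.count >= 2, chars[0] == "/", chars[1] == "*" else { return nil }

    var depth = 0  // nesting depth of /* ... */ pairs
    var i = 0
    while i + 1 < chars.count {
        if chars[i] == "/" && chars[i + 1] == "*" {
            depth += 1
            i += 2
        } else if chars[i] == "*" && chars[i + 1] == "/" {
            // Note: no check for whether we are inside a string literal.
            depth -= 1
            i += 2
            if depth == 0 { return String(chars[i...]) }
        } else {
            i += 1
        }
    }
    return nil  // ran out of input before the comment closed
}

// The reduction above, written as a string so it can be inspected.
let reduced = "/*\nlet x = \"*/\"\n*/"
let leftover = textAfterBlockComment(reduced)
// leftover is a lone double quote, a newline, and a stray */,
// which is what would still be lexed as code.
```

Under this model the comment ends at the */ inside the string literal, so the closing quote and the final */ are left behind as code.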

dabrahams commented 5 days ago

That's just a (very nice) reduction of my example, which didn't contain any regex literals either.

hamishknight commented 5 days ago

Right, I just wanted to clarify that the issue isn't the fact that the compiler supports parsing regex literals; this would have been an issue before they were added.

dabrahams commented 5 days ago

Oh. Arguably not a bug at all then. It was never reliable if you had the wrong characters in the code.

hamishknight commented 4 days ago

Yeah, I don't think we'd want to change the way we lex comments to avoid this. I think we'd want some custom-delimiter version of a multiline comment, i.e. a comment equivalent of #" where you can add characters to the starting delimiter that then need to be matched in the ending delimiter.
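
To make the custom-delimiter idea concrete, here is a minimal sketch of how such a comment could be scanned. Everything in it is an assumption for illustration: Swift has no such syntax, the #/* ... */# spelling is invented here (a real design would have to avoid clashing with the existing #/.../# extended regex literals), and textAfterExtendedComment is a hypothetical helper.

```swift
import Foundation

/// Returns the text after a HYPOTHETICAL extended comment of the form
/// #/* ... */# (with any number of matching pound signs), or nil if
/// `source` does not start with one or the comment is unterminated.
func textAfterExtendedComment(_ source: String) -> String? {
    // Count the pound signs that introduce the comment.
    let pounds = source.prefix(while: { $0 == "#" }).count
    guard pounds > 0 else { return nil }

    let hashes = String(repeating: "#", count: pounds)
    let opener = hashes + "/*"
    let closer = "*/" + hashes  // must carry the same number of pounds
    guard source.hasPrefix(opener) else { return nil }

    // A plain */ inside the body no longer ends the comment;
    // only the full closing delimiter does.
    let body = source.dropFirst(opener.count)
    guard let end = body.range(of: closer) else { return nil }
    return String(body[end.upperBound...])
}

// The earlier reduction, wrapped in the hypothetical delimiters.
let wrapped = "#/*\nlet x = \"*/\"\n*/#\nlet y = 1"
let remaining = textAfterExtendedComment(wrapped)
// Here the */ inside the string literal is skipped over.
```

As with raw strings, if the commented-out code happened to contain */# itself, one more pound sign on both delimiters would be enough to disambiguate under this scheme.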