openrewrite / rewrite-kotlin

Work-in-progress implementation of Kotlin language support for OpenRewrite.
Apache License 2.0
43 stars 12 forks source link

Parsing issue: Back ticks are preserved in identifier. #479

Closed traceyyoshima closed 10 months ago

traceyyoshima commented 10 months ago
import java.lang.Integer.MAX_VALUE
val m = `MAX_VALUE`
kunli2 commented 10 months ago

from PSI perspective,

`MAX_VALUE`

is indeed an identifier, so I think it's fine.

    \---(35,54) | PROPERTY | KtProperty | Text: "val m = `MAX_VALUE`"
        |---(35,38) | val | LeafPsiElement | Text: "val"
        |---(38,39) | WHITE_SPACE | PsiWhiteSpaceImpl | Text: " "
        |---(39,40) | IDENTIFIER | LeafPsiElement | Text: "m"
        |---(40,41) | WHITE_SPACE | PsiWhiteSpaceImpl | Text: " "
        |---(41,42) | EQ | LeafPsiElement | Text: "="
        |---(42,43) | WHITE_SPACE | PsiWhiteSpaceImpl | Text: " "
        \---(43,54) | REFERENCE_EXPRESSION | KtNameReferenceExpression | Text: "`MAX_VALUE`"
            \---(43,54) | IDENTIFIER | LeafPsiElement | Text: "`MAX_VALUE`"
traceyyoshima commented 10 months ago

from PSI perspective,

MAX_VALUE is indeed an identifier, so I think it's fine.

Unfortunately, the PSI is not a reliable source to make decisions because it's generated by a high-level lexer later interpreted by the compiler.

Here is a link to the documentation: https://kotlinlang.org/docs/java-interop.html#escaping-for-java-identifiers-that-are-keywords-in-kotlin.

Consider what is happening in the code. The identifier

`MAX_VALUE`

is from Integer.MAX_VALUE. The MAX_VALUE field on Integer does not have the escapes, but the Kotlin code happens to be escaped. This is important because recipes like ChangeType will update the name if the identifier and type match.

Using ChangeType from Map to List. I'd expected a change:

`Map`

to

`List`

Other applicable cases, like escaped method names, would fail if we used the raw PSI text. The example is to show the issue, but identifiers should generally contain the correct text.

fun `escapedMethodName`(){}
knutwannheden commented 10 months ago

I think we should add a Quoted marker for this purpose. Many languages have that concept.

traceyyoshima commented 10 months ago

I think we should add a Quoted marker for this purpose. Many languages have that concept.

That sounds good -- I know this happens in rewrite-javascript as well, though I don't recall the rules for other languages off the top of my head.

That'll allow changes to occur on the identifier while preserving the escapes. And we could technically detect unnecessary escapes and remove them in a clean-up recipe.