microsoft / qsharp-compiler

Q# compiler, command line tool, and Q# language server
https://docs.microsoft.com/quantum
MIT License
682 stars 171 forks source link

Incorrect parsing of nested string literals #453

Closed abacabadabacaba closed 3 years ago

abacabadabacaba commented 4 years ago

This program fails to compile:

namespace Example {
    open Microsoft.Quantum.Intrinsic;
    @EntryPoint()
    operation Test() : Unit {
        Message($"{";"}");
    }
}

A very similar program compiles just fine:

namespace Example {
    open Microsoft.Quantum.Intrinsic;
    @EntryPoint()
    operation Test() : Unit {
        Message($"{"foo"}");
    }
}
bettinaheim commented 4 years ago

@abacabadabacaba Indeed - that's one edge case I didn't think of! You are not by any chance eager to participate in bug bashes? ;)

For anyone who is interested in picking that one up, I can give a detailed explanation of why this case isn't handled, and which code needs to be adapted. The issue would only occur if you have a string containing {,} or ; as an interpolated argument in a string. I don't know when I'll get around to looking at it, but it should be a fairly easy fix if anyone wants to pick it up - just ping me.

abacabadabacaba commented 4 years ago

This can also be triggered by a nested interpolated string. Something like $"Hello, {loggedIn ? $"user {userName}" | "anonymous"}".

bettinaheim commented 4 years ago

This can also be triggered by a nested interpolated string. Something like $"Hello, {loggedIn ? $"user {userName}" | "anonymous"}".

Yes, as I mentioned, any of '{', '}' or ';' within a string inside an interpolation argument would cause the issue. I believe there is also always a more or less convenient workaround. The case above for example could be expressed as "Hello, " + (loggedIn ? $"user {userName}" | "anonymous").

ricardoprins commented 4 years ago

Hey @bettinaheim, I'd like to help with this one :)

bettinaheim commented 4 years ago

@ricardoprins Awesome! Sorry for the delay in response. I believe a good fix for this one should in fact also fix this one: https://github.com/microsoft/qsharp-compiler/issues/457.

The Q# compiler does an initial preprocessing before even building the tokenization as part of the incremental compilation. That preprocessing among other things needs to track where strings (and comments) start and end such that the content of strings is treated as a "black box" when computing token ranges. This is done here.

I believe what should work well is to treat and track the curly brackets ({}) around interpolation arguments as string delimiters as well, by simple modifying the method linked above. What you would need to watch out for is that curly brackets only contain interpolated arguments if the string is prefaced by $ (i.e. "not an interpolated string {normal text}" vs $"interpolated string with some {arg}"). Additionally, any prefacing them by a backslash ($"normal curly: \{") makes them usable as char.

Take a look at the code, and let me know what you think.

ricardoprins commented 4 years ago

Alright! I'll take a close look at this later today! Thanks :)

bettinaheim commented 4 years ago

Hi @ricardoprins, How is it going with this one? Are you still interested in giving it a go? Otherwise I'll schedule it for us to pick up.

ricardoprins commented 4 years ago

Hi! I'd love to help, but I'm totally stuck in work projects, sorry for not going ahead with that.

As soon as I'm more free, I'll work on some other issue here :)

bettinaheim commented 4 years ago

Hi! I'd love to help, but I'm totally stuck in work projects, sorry for not going ahead with that.

As soon as I'm more free, I'll work on some other issue here :)

I know the problem :)

Pallavi-mp commented 4 years ago

Hey @bettinaheim I'd like to have a go at this. How do you test/debug the compiler code? I ran bootstrap.ps1 to build the code as mentioned in the documentation but not sure how to test/debug it after this. Is there some other documentation that I can refer to for this?

bamarsha commented 4 years ago

Hi @Pallavi-mp, you can test the compiler by running the CommandLineTool project or by adding unit tests. To use the command line tool, in the src/QsCompiler/CommandLineTool folder, you can run:

dotnet run -- build --input MyProgram.qs

to compile a Q# file named MyProgram.qs. You can try to reproduce the bug this way by putting the example program in the file.

For unit tests, you can add some test cases to the string parser tests: https://github.com/microsoft/qsharp-compiler/blob/79503a2a3ce31e9beeb7c65fc4d24fdc0c2cbdf4/src/QsCompiler/Tests.Compiler/SyntaxTests.fs#L72-L73

but the bug might not actually happen at this stage of parsing, so you may need to add cases to one of the Q# test files, like LocalVerification.qs, which is tested by LocalVerificationTests.fs.

ScottCarda-MS commented 3 years ago

This is resolved by #735.