tree-sitter / tree-sitter-c-sharp

C# Grammar for tree-sitter
MIT License
177 stars 47 forks

Fix C# raw_string_literal #293

Closed gonglinyuan closed 1 year ago

gonglinyuan commented 1 year ago

Example program:

# define DEBUG

using System;

/** <summary>Class constructor</summary>
  * <param>param1</param>
*/
public class HelloWorld {
  /// <summary>
  ///   This is a C sharp docstring.
  /// </summary>
  public static void Main(string[] args) {
    int x, y;
    string z, zz;
    char zzz;
    string tb1, tb2, tb3;
    for (x = 0; x < 10; x++) {
      y = x + 1;
      break;  // 123
      // this is another comment.
    }
    /* This is a block comment,
    that has two lines.
    */
    if (x < 5) {
      y = 2;
    }
#line 100 "special"
    z = "123";
    zz = @"456";
    zzz = 'T';
    tb1 = """text block 1""";
    tb2 = """
    text block 2
    """;
    tb3 = """"
    text block 3
    """";
    Console.WriteLine($"Hello, {z}! Do you know {tb1}?");
    var bytes = "hello"u8;
#if DEBUG
    Console.WriteLine("Debug version");
#endif
  }
}

In this program, tb1, tb2, and tb3 are each assigned a separate raw_string_literal. However, the tree-sitter parser treats all three as a single raw_string_literal containing

text block 1""";
    tb2 = """
    text block 2
    """;
    tb3 = """"
    text block 3
    "

which is incorrect. A change to the grammar can fix this issue.

gonglinyuan commented 1 year ago

> Hi @gonglinyuan, thanks for this. It looks like a good improvement.
>
> Any chance you can add some new snippet tests to one of the corpus/*.txt files covering the scenarios that failed before? That way we can be sure we don't regress again in the future.
>
> Let me know if you have questions.

I added a test to literals.txt.

damieng commented 1 year ago

Thank you for fixing this!