dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License

Formatter does not work correctly in conjunction with the SyntaxFactory for multiline xml doc comments #3140

Open Inspyro opened 9 years ago

Inspyro commented 9 years ago

Hi, after some hours of debugging I found a potential bug in Roslyn. When a PropertyDeclaration is generated with the SyntaxFactory, zero-width trailing whitespace trivia is added automatically. When the leading trivia of the next member is a multiline XML doc comment, the Formatter formats it incorrectly, apparently because it is confused by that trailing whitespace. When the trailing whitespace is removed from the generated PropertyDeclaration, the formatting works as expected.

Code sample:

private static void Main ()
{
  var treeRoot = CSharpSyntaxTree.ParseText (@"namespace X{class C{}}", new CSharpParseOptions()).GetRoot();
  var classC = treeRoot.DescendantNodes().OfType<ClassDeclarationSyntax>().Single();

  //NOTE: Calling WithTrailingTrivia() on member1 - fixes the problem.
  var member1 = SyntaxFactory.PropertyDeclaration (SyntaxFactory.ParseTypeName ("string"), "Member1").WithSemicolonToken(SyntaxFactory.Token(SyntaxKind.SemicolonToken));

  //member 2 has a multiline doc comment
  var member2 = SyntaxFactory.PropertyDeclaration (SyntaxFactory.ParseTypeName ("string"), "Member2").WithSemicolonToken(SyntaxFactory.Token(SyntaxKind.SemicolonToken))
      .WithLeadingTrivia (
          SyntaxFactory.ParseLeadingTrivia (@"/** this
  * is multiline 
  */"));

  treeRoot = treeRoot.ReplaceNode (classC, classC.AddMembers (member1, member2));

  FormatAndOutput (treeRoot);
}

private static void FormatAndOutput (SyntaxNode treeRoot)
{
  using (var workspace = new AdhocWorkspace())
  {
    var project = workspace.AddProject ("Test", LanguageNames.CSharp)
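        // CreateFromAssembly is obsolete in later Roslyn releases; CreateFromFile is the documented replacement.
        .AddMetadataReference (MetadataReference.CreateFromFile (typeof (string).Assembly.Location));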

    var document = project.AddDocument ("Test.cs", treeRoot);

    //the formatter does not handle zero-width trailing whitespace correctly when a multiline doc comment follows:
    var formattedDocument = Formatter.FormatAsync (document, project.Solution.Workspace.Options).Result;

    Console.WriteLine (formattedDocument.GetTextAsync().Result.ToString());
    Console.Read();
  }
}

(I know that the property declarations are incomplete, but the problem persists when you add the getter and setter.)

When the code is executed the output is:

namespace X
{
    class C
    {
        string Member1;
        /** this
* is multiline
*/
        string Member2;
    }
}

which is wrong, because the second and third lines of the multiline comment lose their indentation.

When the trailing trivia of member1 is removed, the output is:

namespace X
{
    class C
    {
        string Member1;/** this
      * is multiline
      */string Member2;
    }
}

which seems (more) correct
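
The workaround described above can be sketched as follows. This is a minimal illustration, not a fix for the formatter itself: `CreateMemberWithoutTrailingTrivia` is a hypothetical helper name introduced here, and it only wraps the `WithTrailingTrivia()` call mentioned in the repro's comment.

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class TrailingTriviaWorkaround
{
    // Hypothetical helper (not a Roslyn API): builds "string <name>;" the same
    // way the repro does, then drops the trailing trivia the factory attached.
    public static PropertyDeclarationSyntax CreateMemberWithoutTrailingTrivia(string name)
    {
        var member = SyntaxFactory
            .PropertyDeclaration(SyntaxFactory.ParseTypeName("string"), name)
            .WithSemicolonToken(SyntaxFactory.Token(SyntaxKind.SemicolonToken));

        // Calling WithTrailingTrivia() with no arguments replaces the node's
        // trailing trivia with an empty list.
        return member.WithTrailingTrivia();
    }

    public static void Main()
    {
        var member1 = CreateMemberWithoutTrailingTrivia("Member1");

        // Prints the number of trailing trivia left on the node.
        Console.WriteLine(member1.GetTrailingTrivia().Count);
    }
}
```

Stripping the trivia changes the formatter's input, so it is worth comparing the formatted output with and without the helper before relying on it.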

I found out that in the first case, with the trailing whitespace, the indentationDelta calculated in CSharpTriviaFormatter.FormatDocumentComment is negative, but it should be 0.

Maybe the SyntaxFactory builds a wrong syntax tree, or the Formatter has a bug. I hope this helps make Roslyn even better.

mattwar commented 7 years ago

All tokens and nodes created with the factory have zero-width leading and trailing elastic trivia by default. This type of trivia tells the formatter that it is free to apply formatting rules in these places, replacing trivia that would otherwise be preserved as typed by the user or identified by the parser.

The formatter may have some bugs related to the existence of this trivia.
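
To see the elastic trivia described above, one can inspect a factory-built node directly. The sketch below assumes the Microsoft.CodeAnalysis.CSharp package is referenced. `HasOnlyElasticTrailingTrivia` is a hypothetical helper name, and checking for `SyntaxAnnotation.ElasticAnnotation` is one way to detect elastic trivia, not necessarily how the formatter does it internally.

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

public static class ElasticTriviaDemo
{
    // Hypothetical helper: true when the node has trailing trivia and every
    // piece of it is zero-width and carries the elastic annotation.
    public static bool HasOnlyElasticTrailingTrivia(SyntaxNode node)
    {
        var trailing = node.GetTrailingTrivia();
        if (trailing.Count == 0)
        {
            return false;
        }

        foreach (var trivia in trailing)
        {
            bool isElastic = trivia.HasAnnotation(SyntaxAnnotation.ElasticAnnotation);
            if (trivia.FullSpan.Length != 0 || !isElastic)
            {
                return false;
            }
        }

        return true;
    }

    public static void Main()
    {
        // Same construction as member1 in the repro.
        var member1 = SyntaxFactory
            .PropertyDeclaration(SyntaxFactory.ParseTypeName("string"), "Member1")
            .WithSemicolonToken(SyntaxFactory.Token(SyntaxKind.SemicolonToken));

        // List each trailing trivia with its kind, width, and elastic flag.
        foreach (var trivia in member1.GetTrailingTrivia())
        {
            Console.WriteLine(
                $"{trivia.Kind()} width={trivia.FullSpan.Length} " +
                $"elastic={trivia.HasAnnotation(SyntaxAnnotation.ElasticAnnotation)}");
        }

        Console.WriteLine(HasOnlyElasticTrailingTrivia(member1));
    }
}
```

A node parsed from source text should carry no elastic annotation on its trivia, which makes this check a quick way to tell factory-built nodes from parsed ones when debugging formatter output.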