RomanYankovsky / DelphiAST

Abstract syntax tree builder for Delphi
Mozilla Public License 2.0

Add XMLComment type #178

Open JBontes opened 8 years ago

JBontes commented 8 years ago

XML documentation comments could be parsed at some future point. It might be a good idea to tag them separately as XMLDoc comments.
http://docwiki.embarcadero.com/RADStudio/Berlin/en/XML_Documentation_Comments

See the proposed code changes in the follow-up post.

JBontes commented 8 years ago

Proposal:

procedure TmwBasePasLex.SlashProc;
var
  BeginRun: Integer;
  CommentText: string;
begin
  case FBuffer.Buf[FBuffer.Run + 1] of
    '/':
      begin
        if FBuffer.Buf[FBuffer.Run + 2] = '/' then begin
          Inc(FBuffer.Run, 3);
          BeginRun := FBuffer.Run;
          FTokenID := ptXMLComment;
        end
        else begin
          Inc(FBuffer.Run, 2);
          BeginRun := FBuffer.Run;
          FTokenID := ptSlashesComment;
        end;

        while FBuffer.Buf[FBuffer.Run] <> #0 do
        begin
          case FBuffer.Buf[FBuffer.Run] of
            #10, #13: Break;
          end;
          Inc(FBuffer.Run);
        end;

        if Assigned(FOnComment) then
        begin
          SetString(CommentText, PChar(@FBuffer.Buf[BeginRun]), FBuffer.Run - BeginRun);
          FOnComment(Self, CommentText);
        end;
      end;
  else
    begin
      Inc(FBuffer.Run);
      FTokenID := ptSlash;
    end;
  end;
end;

...
    ntWrite,

    ntXMLComment,
    ntAnsiComment,
    ntBorComment,
    ntSlashesComment
  );
....
procedure TPasSyntaxTreeBuilder.DoOnComment(Sender: TObject; const Text: string);
var
  Node: TCommentNode;
begin
  case TokenID of
    ptAnsiComment: Node := TCommentNode.Create(ntAnsiComment);
    ptBorComment: Node := TCommentNode.Create(ntBorComment);
    ptSlashesComment: Node := TCommentNode.Create(ntSlashesComment);
    ptXMLComment: Node := TCommentNode.Create(ntXMLComment);
  else
    raise EParserException.Create(Lexer.PosXY.Y, Lexer.PosXY.X, Lexer.FileName, 'Invalid comment type');
  end;

  Node.Col := Lexer.PosXY.X;
  Node.Line := Lexer.PosXY.Y;
  Node.FileName := Lexer.FileName;
  Node.Text := Text;

  FComments.Add(Node);
end;
...
    'write',

    'xmlcomment',
    'ansicomment',
    'borlandcomment',
    'slashescomment'
  );
...
procedure TmwSimplePasPar.SkipXMLComment;
begin
  Expected(ptXMLComment);
end;
...
procedure TmwSimplePasPar.SkipJunk;
begin
  if Lexer.IsJunk then
  begin
    case TokenID of
      ptAnsiComment:
        begin
          SkipAnsiComment;
        end;
      ptBorComment:
        begin
          SkipBorComment;
        end;
      ptSlashesComment:
        begin
          SkipSlashesComment;
        end;
      ptXMLComment:
        begin
          SkipXMLComment;
        end;
      ptSpace:
        begin
          SkipSpace;
        end;
      ptCRLFCo:
        begin
          SkipCRLFco;
        end;
      ptCRLF:
        begin
          SkipCRLF;
        end;
      ptSquareOpen:
        begin
          CustomAttribute;
        end;
    else
      begin
        Lexer.Next;
      end;
    end;
  end;
  FLastNoJunkPos := Lexer.TokenPos;
  FLastNoJunkLen := Lexer.TokenLen;
end;
...
function IsTokenIDJunk(const aTokenID: TptTokenKind): Boolean;
begin
  Result := aTokenID in [
    ptAnsiComment,
    ptBorComment,
    ptCRLF,
    ptCRLFCo,
    ptSlashesComment,
    ptXMLComment,
    ptSpace,
    ptIfDirect,
    ptElseDirect,
    ptIfEndDirect,
    ptElseIfDirect,
    ptIfDefDirect,
    ptIfNDefDirect,
    ptEndIfDirect,
    ptIfOptDirect,
    ptDefineDirect,
    ptUndefDirect];
end;

Tested using the following code:

unit RepeatWhile;

//comment

interface

implementation

procedure DoWhile;
begin
  while true do begin
    break;
  end;
  while false do DoWhile;
end;

///comment

procedure DoRepeat;
begin
  repeat
    break;
  until false;
  repeat until true;
end;

end.

NickRing commented 8 years ago

I'm a bit concerned about possible buffer overruns in:

  case FBuffer.Buf[FBuffer.Run + 1] of
    '/':
      begin
        if FBuffer.Buf[FBuffer.Run + 2] = '/' then begin

vintagedave commented 8 years ago

Why would DelphiAST parse XML, ever?

Wouldn't you handle these comments by extracting the comment text from DelphiAST and then processing it with an XML parser?

JBontes commented 8 years ago

The only thing I'm doing is tagging XMLDoc comments as their own comment type: comments that happen to start with ///. That's all. I'm not parsing anything.

Wouldn't you handle these comments by extracting the comment text from DelphiAST and then processing it with an XML parser?

Yes, that is exactly my plan.

vintagedave commented 8 years ago

Ah! I misunderstood; I thought some of the comments would be parsed and their content available in the tree.

Sorry. You can see why I was puzzled :)

JBontes commented 8 years ago

I'm a bit concerned about possible buffer overruns in:

  case FBuffer.Buf[FBuffer.Run + 1] of
    '/':
      begin
        if FBuffer.Buf[FBuffer.Run + 2] = '/' then begin

There is no problem here.
If the buffer ends, an exception will be raised.
However, that exception would have been raised anyway because of the "unexpected end of source file".

Remember that the buffer is the entire unit, and DelphiAST stops parsing at the end.
If it parsed beyond that, you'd have a point.
I could put a try-except in there.

RomanYankovsky commented 8 years ago

Any comment can start with ///, and XML comments are not one-liners.
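
Since XML documentation usually spans several /// lines, a consumer would have to stitch consecutive comment lines back together before handing them to an XML parser. A minimal sketch of that grouping step, assuming each comment carries its source line and text (TDocLine and JoinRun are hypothetical names, not part of DelphiAST):

```pascal
program JoinXmlDocLines;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Hypothetical, simplified stand-in for a comment node: source line + text.
  TDocLine = record
    Line: Integer;
    Text: string;
  end;

// Joins the run of comments on consecutive source lines that starts at
// Lines[Start]. Returns the index of the first comment after that run.
function JoinRun(const Lines: array of TDocLine; Start: Integer;
  out Joined: string): Integer;
begin
  Joined := Lines[Start].Text;
  Result := Start + 1;
  while (Result <= High(Lines)) and
        (Lines[Result].Line = Lines[Result - 1].Line + 1) do
  begin
    Joined := Joined + #10 + Lines[Result].Text;
    Inc(Result);
  end;
end;

var
  Doc: array[0..2] of TDocLine;
  Block: string;
  Next: Integer;
begin
  // Lines 10 and 11 are adjacent and form one block; line 20 stands alone.
  Doc[0].Line := 10; Doc[0].Text := '<summary>';
  Doc[1].Line := 11; Doc[1].Text := 'Loops until done.</summary>';
  Doc[2].Line := 20; Doc[2].Text := '<summary>Other</summary>';

  Next := 0;
  while Next <= High(Doc) do
  begin
    Next := JoinRun(Doc, Next, Block);
    Writeln('[', Block, ']');
  end;
end.
```

Grouping by adjacent line numbers is only one possible heuristic; it treats any gap, even a blank line, as the end of a documentation block.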

JBontes commented 8 years ago

I know, and I can just start my hunt for XMLDoc in the heap of slashes comments: every // comment whose text starts with a / and has valid XMLDoc in it. Feel free to close the issue.
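
That hunt can be sketched without touching the lexer. The example assumes the comment text handed to the callback excludes the leading //, so a /// line arrives with one remaining slash in front. IsXmlDocCandidate and XmlDocPayload are hypothetical helper names.

```pascal
program XmlDocCandidates;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// True when the text of a // comment (without its two leading slashes)
// starts with a third slash, i.e. the source line began with ///.
function IsXmlDocCandidate(const CommentText: string): Boolean;
begin
  Result := Copy(CommentText, 1, 1) = '/';
end;

// Strips the third slash and one optional space that follows it.
function XmlDocPayload(const CommentText: string): string;
begin
  Result := Copy(CommentText, 2, MaxInt);
  if Copy(Result, 1, 1) = ' ' then
    Result := Copy(Result, 2, MaxInt);
end;

const
  // Sample comment texts as the assumed callback would deliver them.
  Comments: array[0..2] of string =
    ('comment', '/ <summary>Runs the loop.</summary>', '');
var
  I: Integer;
begin
  for I := Low(Comments) to High(Comments) do
    if IsXmlDocCandidate(Comments[I]) then
      Writeln(XmlDocPayload(Comments[I]));
end.
```

A leading slash only marks a candidate. Because any comment can start with ///, the payload still has to be checked by a real XML parser.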

NickRing commented 8 years ago

Don't get me wrong: any work on this is good, but it needs to be safe. My concern is the case where the last character is a '/'. The code I highlighted would then read 1 or 2 characters past the end of the file (in memory), and depending on whether Range Checking is on, an exception might not be raised (i.e. if it is turned off).

It is possible to ensure Range Checking is turned on but anyone could turn it off again...

Just being a Negative Nancy :-)

JBontes commented 8 years ago

Nick, there is zero chance of an exception from reading 2 bytes past the end of the file.
We already know the first / is OK, and after that there is a guaranteed #0 char.
It is simply impossible for there to be an exception.
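
A minimal sketch of that argument, assuming the buffer is a #0-terminated PChar as described above. ClassifySlash and TSlashKind are hypothetical names used only for illustration, not DelphiAST code. The second lookahead is reached only when the first one found a '/', so the furthest character ever read is the terminator itself.

```pascal
program SlashLookahead;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Hypothetical stand-in for the lexer's token kinds.
  TSlashKind = (skSlash, skSlashesComment, skXMLComment);

// Classifies the token starting at Buf[Run], which must be '/'.
// Buf must be #0-terminated, as a PChar obtained from a string is.
function ClassifySlash(Buf: PChar; Run: Integer): TSlashKind;
begin
  if Buf[Run + 1] <> '/' then
    Result := skSlash            // Buf[Run + 1] is at worst the #0 terminator
  else if Buf[Run + 2] = '/' then
    Result := skXMLComment       // reached only when Buf[Run + 1] = '/'
  else
    Result := skSlashesComment;  // Buf[Run + 2] is at worst the #0 terminator
end;

const
  // Edge cases: a lone slash and a bare // at the very end of the buffer.
  Samples: array[0..3] of string = ('/', '//', '///', '/ x');
var
  I: Integer;
begin
  for I := Low(Samples) to High(Samples) do
    Writeln(Samples[I], ' -> ', Ord(ClassifySlash(PChar(Samples[I]), 0)));
end.
```

The guarantee rests entirely on the terminator. A buffer that is not #0-terminated would need an explicit length check before each lookahead.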