ethereum / solidity

Solidity, the Smart Contract Programming Language
https://soliditylang.org
GNU General Public License v3.0
23.35k stars 5.77k forks source link

Full account of comments in the AST #13865

Open ekpyron opened 1 year ago

ekpyron commented 1 year ago

Tooling asked us to (potentially optionally) include all comments in the AST for being able to support pretty-printing ASTs to near-identical sources (except whitespace differences, accounting for which would definitely go too far).

There's related discussion already in e.g. https://github.com/ethereum/solidity/issues/4559.

We'll need to weigh the complexity this would induce for scanning and parsing and check if this can be done in a reasonable amount of time (since this is non-critical).

We can consider removing comments on the expression level from the scope of this, since they may be a larger hassle than comments on the statement level.

We'd also need to decide whether to introduce full AST nodes (maybe that'd be the way to go for statement-level comments) or fields similar to the documentation field used for natspac comments (that may be the only choice for comments on the expression level, if we even consider those).

Some justification for spending time on this otherwise tangential issue is that we could use such comments for testing AST-based properties like the annotations produced by https://github.com/ethereum/solidity/pull/13378 in isoltest by introducing special comments with test expectations. In that sense, this weakly relates to our roadmap task https://github.com/ethereum/solidity/issues/13722

ekpyron commented 1 year ago

Actually, thinking about this a bit further, for the requested use case, it's basically irrelevant in which structure the comments are stored - so this can actually be solved very easily: We just have the scanner not skip over comments, but collect/accumulate them internally in a list/vector. Then whenever we create an AST node in the parser we move all the currently stored comments from the scanner into it.

In the AST, each node could then get an optional comment list field that plainly stores the comments and their source locations.

The only thing left would be where to store the comments past all other nodes of the AST, but they can just end up in the top-level source unit or something like that.

So yeah, if we only emit this somewhat weird addition to the AST conditionally upon explicit Standard-JSON request, this can actually be done rather smoothly, so we can indeed just go for it for 0.8.19 (especially since this should still allow using such comment annotations for testing new AST analysis properties a la https://github.com/ethereum/solidity/pull/13378)

coreyar commented 1 year ago

@cameel This seems like an interesting issue to work on. What do you think about the approach that @ekpyron described?