duaraghav8 / solparse

Solidity Parser for Solium
https://www.npmjs.com/package/solparse
29 stars 15 forks source link

Differences between solparse & solidity-parser #15

Closed duaraghav8 closed 7 years ago

duaraghav8 commented 7 years ago

This thread is for @cgewecke, will keep on making edits to this comment to add differences.

1. Parsing of Pragma statements in solparse

pragma solidity ^4.1.0 yields:

{
  "type": "PragmaStatement",
  "version": "4.1.0",
  "upward_compatible": false,
  "start": 0,
  "end": 22
}

But I find federicobond's suggestion more sensible.

2. Irrelevant Data being created by solidity-parser, fixed in solparse

refer to https://github.com/ConsenSys/solidity-parser/issues/56

cgewecke commented 7 years ago

Thanks for doing this @duaraghav8.

1. Pragma in SP

pragma solidity ^0.4.3 yields

{
  "type": "PragmaStatement",
  "start_version": {
     "type": "VersionLiteral",
     "operator": "^",
     "version": "0.4.3"
   },
  "end_version": null,
  "start": 0,
  "end": 23
},

So it's different than what frederico suggested - just has slightly more granularity than solparse. Looking at the rules for imports-on-top and pragma-on-top it seems like SP's AST is not a conflict for Solium?

2 Top level

There are significant differences at the top level between the two parsers. Solidity-parser has gradually become less permissive during the fall/winter and it seems like one of the goals is to make it as precise as possible.

I'm going to do the same as you suggested in the Solium SP integration issue and try to see what crashes when you plug solidity-parser into Solium. . .

cgewecke commented 7 years ago

3 Import

solidity-parser now requires import to be a top level declaration so that linting rule gets pre-empted. Solium tests produce a false positive because parser beats them to the complaint. Fix by removing test?

4 With

solidity-parser removed the with token. no-with.js also pre-empted, false positive, same fix.

5 uint [] x

solidity parser rejects these extra spaces but they are legal. Bug fix there.

6 Tuples

Bug fix there.

7 Structs as function arguments

call ({name: "foo"\n, age: 20,id: 1 ,dept: "math"}) This snippet comes from a whitespace.js test. Solidity-parser rejects it but maybe that's correct? Not sure. This is the grammar frederico is using. Again this fix would just be a test change.

cgewecke commented 7 years ago

Last but not least!

8 How function body's brackets are managed.

Solparse gives the brackets to FunctionBody, solidity parser gives them to the parents of FunctionBody here and here. This messes up the start/end values for the node and causes a bunch of bracket tests to fail. Isn't solparse's implementation the correct one? Bug fix there?

@duaraghav8 I think if changes are made to deal with 3 through 8 all of solium's tests will pass. Number two looks bad but doesn't seem to be breaking anything as far as I can tell. The only other thing will be that the unit tests will have to run a patched parser build that doesn't complain about being fed raw statements.

What's your take on all this?

duaraghav8 commented 7 years ago

Thanks @cgewecke. I'm answering your questions point-by-point

1. Pragma

No conflict here, we're good to go.

2. Top Level

By "less permissive", I'm guessing you mean that SP now only parses code if its in proper structure (Statements enclosed inside Contract, so if you only try to parse a single statement, throws error). Am I right? If this is what you mean, then yes, we will make solparse more restrictive (just like SP) too.

3. Import

federico pointed this out too, I'll remove the test and rule imports-on-top. A very important thing is to consider backward compatibility here. I'll either have to deprecate Solium v0 and start fresh at v1 or continue support by editing config files of users when they update solium. Do you have a take on this?

4. with

If SP is completely handling (and rejecting) with, then this rule & test will be removed too.

5 & 6

no comment

7 Struct

I'm really not sure about this right now. I'll ask in the Solidity gitter chat room and then we can decide.

duaraghav8 commented 7 years ago

8. Start/end position

Ok, this one is very crucial, I fought a long and hard battle to finally fix it. Yes, IMO, solparse's implementation for positions IS the right one. Have a look at this issue.

The position info has potential to cause massive conflicts, so I suggest we fix this in SP. Let me know what you think.

ps- No. 2 nevertheless needs to be fixed in SP because the AST structure goes wrong because of that. Like I mentioned in the issue, <Expression>; (expression followed by semicolon) together becomes ExpressionStatement, but SP doesn't build the ExpressionStatement node outside the DeclarativeExpression node.

duaraghav8 commented 7 years ago

Here is how I fixed the position problem: https://github.com/duaraghav8/solparse/commit/5a5df692e9cc45f365b8ab58e1263143ef55c329

(ignore the console log statement in the commit, I forgot to remove it after debugging. I had removed it in a subsequent commit)

cgewecke commented 7 years ago

Ok awesome, all really good points! I'm going to have to look at them more closely tommorow. Its the end of the day for me, sadly.

cgewecke commented 7 years ago

2. Precision

Yes, exactly. I think frederico would like to have the parser match the specificity of the grammar. It's possible this would help with implementing some of the rules on the wishlist. The missing expression statement is a product of top level changes - if you look down at the end of SPs new pegjs it's a little clearer what has happened. The path to declarative expression now runs:

Program --> SourceUnits ---> Contract ---> SourceElements --> DeclarativeExpression

And DeclarativeExpression is going to be made more precise as well so the parser will mirror Solidity's distinction between the contract and function level declarations. This will involve some node type renaming that Solium would have to accomodate. But just by recognizing the type name and duplicating logic that already exists for DeclarativeExpression.

3. Import

That's a tough one and I'm sure you know best. If you just remove the rejection test so the suite passes, then the main difference will be that the user gets a vague parser error instead of a smart linter error? And the import logic won't actually run?

8. Fn bodies

I'm researching 8 a bit more today. SP's code is pegjs's code so the question is: why did they do this? function BlockStatements end up being parsed differently than if, for, while etc BlockStatements. Is the inconsistency intentional or an accident?

cgewecke commented 7 years ago

Some more notes on 8.

This is how esprima (the eslint parser) handles fn bodies:

export class FunctionDeclaration {
    readonly type: string;
    readonly id: Identifier | null;
    readonly params: FunctionParameter[];
    readonly body: BlockStatement;
    readonly generator: boolean;
    readonly expression: boolean;
    readonly async: boolean;
    constructor(id: Identifier | null, params: FunctionParameter[], body: BlockStatement, generator: boolean) {
        this.type = Syntax.FunctionDeclaration;
        this.id = id;
        this.params = params;
        this.body = body;
        this.generator = generator;
        this.expression = false;
        this.async = false;
    }
}

export class FunctionExpression {
    readonly type: string;
    readonly id: Identifier | null;
    readonly params: FunctionParameter[];
    readonly body: BlockStatement;
    readonly generator: boolean;
    readonly expression: boolean;
    readonly async: boolean;
    constructor(id: Identifier | null, params: FunctionParameter[], body: BlockStatement, generator: boolean) {
        this.type = Syntax.FunctionExpression;
        this.id = id;
        this.params = params;
        this.body = body;
        this.generator = generator;
        this.expression = false;
        this.async = false;
    }
}

There's no intermediating FunctionBody stuff. It's handled as a BlockStatement.

Their IfStatement is:

export class IfStatement {
    readonly type: string;
    readonly test: Expression;
    readonly consequent: Statement;
    readonly alternate: Statement | null;
    constructor(test: Expression, consequent: Statement, alternate: Statement | null) {
        this.type = Syntax.IfStatement;
        this.test = test;
        this.consequent = consequent;
        this.alternate = alternate;
    }
}

This would run the path: Statement --> BlockStatement and produce the same results as Solparse. The grammar isn't in the same form but I think this is an indication pegjs may have just gotten it wrong. Does that make sense or is it not necessarily as conclusive as it seems to me?

cgewecke commented 7 years ago

@duaraghav8 How would you like to proceed? Would like to open PR's in solidity-parser with your fixes for tuples, function bodies and array spacing so they can review (and hopefully merge)? I'm going to open another issue over there about the function body problem arguing that pegjs's implementation is unhelpful.

cgewecke commented 7 years ago

Edit: the above should read: 'Would you like to open PRs...

duaraghav8 commented 7 years ago

@cgewecke since you took time to understand the code of both the parsers and exact differences to fix them, I don't mind if you go ahead and make the PR (you deserve the contri credit for that). However, if you want me to make the PR, no problem. Let me know.

cgewecke commented 7 years ago

@duaraghav8 I really feel that you should get the contrib credit for these fixes since you did the work to solve them. I will only make them if you are too busy for this (i.e. it will be weeks before you can get to it) or it would just be more convenient for you. If that's the case - let me know - it's no problem.

Summary / Book-keeping: