ConsenSysMesh / solidity-parser

Solidity Parser in Javascript
137 stars 53 forks source link

fix: Issue_38 allow varaibles to have left or right bracket #39

Closed sogoiii closed 7 years ago

sogoiii commented 7 years ago

@tcoulter @federicobond Resolves issue 38. This pull request is heavy handed.

imports.pegjs is identical to solidity.pegjs with the addition of simplifyImports to conform with the original output for truffle profiler.

Adding the test to the current repo will result in failures. But once patch is applied, test pass.

federicobond commented 7 years ago

Thanks @sogoiii for your contribution! I would like to hear Tim's opinion on this first, but I would prefer not having to maintain two separate parsers that are almost, but not quite equal.

I think the best course of action is moving the simplifyImports code to the main library and always rely on the full parser.

sogoiii commented 7 years ago

@federicobond I agree that having the two nearly identical files was not the best of solutions. I initially tried to create a rule to make it pass.

Since the import.pegjs is only extracting an array of external files [file1.sol, file2.sol], we could make a custom parser that conforms to the desired spec.

tcoulter commented 7 years ago

The only reason for the imports parser was to significantly increase performance when solely looking for import statements -- it's able to do this because it only has to look for a few constructs and ignore the rest. If you can maintain the same performance gains then I'm happy to merge the two, but I'm not sure we can.

There's no tests that show off the performance difference, but when I created the parser there was a noticeable one. If we can show that the difference is now negligible I'm happy to merge (in which case we should remove the imports parser entirely and just do the filtering on the JS end).

Either way, thanks @sogoiii. Happy to have you contributing.

federicobond commented 7 years ago

Can we solve the imports lookup with a couple of regular expressions? It will be less brittle than parsing a subset of the language.

regular_expressions

tcoulter commented 7 years ago

I tried for Truffle. That's how the solidity parser was born. :)

@federicobond has removed a lot of the JS elements from the spec recently and updated PEG.js to a new version, so I'll revisit the performance differences and see if it's changed.

On Mon, Dec 12, 2016 at 6:44 AM, Federico Bond notifications@github.com wrote:

Can we solve the imports lookup with a couple of regular expressions? It will be less brittle than parsing a subset of the language.

[image: regular_expressions] https://cloud.githubusercontent.com/assets/138426/21103154/058f790c-c060-11e6-918f-57047bbc8a9c.png

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ConsenSys/solidity-parser/pull/39#issuecomment-266448029, or mute the thread https://github.com/notifications/unsubscribe-auth/AAFp1cvTC4bbYOllCSfXkt451qFgPesDks5rHV2xgaJpZM4LKKZL .

federicobond commented 7 years ago

Well, this was an interesting find: the full parser is now always faster than the imports parser by about 5% in all my test runs, and there is a lot of room for improvement from what I have seen in the past week.

I would go for deprecating the imports parser and running simplifyImports from the library code.

tcoulter commented 7 years ago

@federicobond Would you mind posting the code (or commands) you used for gauging performance? Interesting find indeed.

On Mon, Dec 12, 2016 at 10:39 AM, Federico Bond notifications@github.com wrote:

Well, this was an interesting find: the full parser is now always faster than the imports parser by about 5% in all my test runs, and there is a lot of room for improvement from what I have seen in the past week.

I would go for deprecating the imports parser and running simplifyImports from the library code.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ConsenSys/solidity-parser/pull/39#issuecomment-266513708, or mute the thread https://github.com/notifications/unsubscribe-auth/AAFp1b1H5qMWRt_6hRsr0MvviwMy1S4vks5rHZTTgaJpZM4LKKZL .

federicobond commented 7 years ago

I performed 9 test runs with npm test and looked at the times for the built parser tests. After averaging the differences, I get a 6,15% cut in total time from the full parser as compared to the imports parser. I will try to post a more precise script later.

sogoiii commented 7 years ago

I plan on doing 'time truffle compile' in the solidity zepplin project. Given that those contracts are a little more complicated then the test contract file.

tcoulter commented 7 years ago

Alright guys, I'm still seeing a significant speed difference. I exaggerated the test a bit to show you some results, but I think you'll get the picture after showing you what I did.

I created two files, test/perf-standard.js, and test/perf-imports.js. Their contents looks like this, nearly identical, but each using a different parser:

./test/perf-standard.js

#!/usr/bin/env node
var SolidityParser = require("../index.js");
var fs = require("fs");
var path = require("path");
var source = fs.readFileSync(path.join(__dirname, "doc_examples.sol"), "utf8");

for (var i = 0; i < 1000; i++) {
  SolidityParser.parse(source, "solidity");
}

./test/perf-imports.js

#!/usr/bin/env node
var SolidityParser = require("../index.js");
var fs = require("fs");
var path = require("path");
var source = fs.readFileSync(path.join(__dirname, "doc_examples.sol"), "utf8");

for (var i = 0; i < 1000; i++) {
  SolidityParser.parse(source, "imports");
}

I only needed to make one run to come up with some results:

Tims-MacBook-Pro:test tim$ time ./perf-imports.js 

real    0m5.467s
user    0m5.506s
sys 0m0.045s

Tims-MacBook-Pro:test tim$ time ./perf-standard.js

real    2m18.177s
user    2m17.767s
sys 0m0.363s

As you can see the runtimes are drastically different. When only looking for imports the imports parser works much faster on large data sets (this is really handy when trying to get a sense of the dependency hierarchy of a set of code). I'm open to other ways of solving the problem if you have ideas, but maintaining an imports parser is the only one I could think of.

If it's not clear, the goal of the imports parser is to ignore as many statements as possible, and only process the ones that are on the same level as the import statement (i.e., top level/program level). For instance, pragma and contract are on the same level, but everything in between a contract's curly braces should be able to be ignored (although I'm probably ignoring them incorrectly, which is the cause of this bug).

tcoulter commented 7 years ago

I should add that I performed the above on master.

federicobond commented 7 years ago

OK, I see my mistake now. I had reset this branch to test other changes and I thought I was running the tests on a clean copy of master but alas, I wasn't.

federicobond commented 7 years ago

I think I have a solution: we can add the StringLiteral option to the BlockList rule in the imports parser, right before NonClosingBracketCharacter.

This will create the appropriate context for interpreting {} inside strings literally, and not as part of a nested structure. Your proposed test contract passes fine with this change @sogoiii.

sogoiii commented 7 years ago

@federicobond that makes sense. Plus minimum speed impact.

federicobond commented 7 years ago

@sogoiii can you update the pull request accordingly?

sogoiii commented 7 years ago

Ill adjust my pull request. @federicobond

sogoiii commented 7 years ago

@federicobond @tcoulter Alright, the pull request was adjusted.

tcoulter commented 7 years ago

Nice work guys. Thanks.