src-d / style-analyzer

Lookout Style Analyzer: fixing code formatting and typos during code reviews
GNU Affero General Public License v3.0

Parsing error in from_node #329

Closed EgorBu closed 5 years ago

EgorBu commented 5 years ago

Feature extractor couldn't extract features from https://github.com/meteor/meteor (from 62fa9927ce34cff064cc3991439553e7c52b5258 to c3309b123a7220ac24cbe73661184ee946bca01f)

  File "/usr/local/lib/python3.5/dist-packages/lookout/core/event_listener.py", line 181, in wrapped_catch_them_all
    return func(self, request, context)
  File "/usr/local/lib/python3.5/dist-packages/lookout/core/event_listener.py", line 202, in wrapped_handle
    return getattr(self.handlers, method_name)(request)
  File "/usr/local/lib/python3.5/dist-packages/lookout/core/manager.py", line 97, in process_push_event
    model = analyzer.train(ptr, mycfg, self._data_service.get())
  File "/usr/local/lib/python3.5/dist-packages/lookout/core/data_requests.py", line 131, in wrapped_with_uasts_and_contents
    return func(cls, ptr, config, data_request_stub, files=files, **data)
  File "/home/egor/workspace/style-analyzer/lookout/style/format/analyzer.py", line 185, in train
    X, y, _ = fe.extract_features(sorted(files, key=lambda x: x.path))
  File "/home/egor/workspace/style-analyzer/lookout/style/format/feature_extractor.py", line 250, in extract_features
    file_vnodes, file_parents = self._parse_file(contents, uast, file.path)
  File "/home/egor/workspace/style-analyzer/lookout/style/format/feature_extractor.py", line 751, in _parse_file
    result.extend(VirtualNode.from_node(node, contents, path, self.token_unwrappers))
  File "/home/egor/workspace/style-analyzer/lookout/style/format/feature_utils.py", line 107, in from_node
    Position(*[f[1] for f in node.start_position.ListFields()]),
TypeError: __new__() missing 1 required positional argument: 'col'

How to reproduce:
1) Launch the format analyzer: analyzer run lookout.style.format -c config.yml --log-level DEBUG
2) Train the model: lookout push ipv4://localhost:2000 --git-dir /path/to/meteor --from 62fa9927ce34cff064cc3991439553e7c52b5258 --to c3309b123a7220ac24cbe73661184ee946bca01f

m09 commented 5 years ago

Thanks! Can you provide your config.yml just to make sure we use the same stuff?

EgorBu commented 5 years ago

Yes, of course. It's the default one:

server: 0.0.0.0:2000
db: sqlite:////tmp/lookout.sqlite
fs: /tmp

EgorBu commented 5 years ago

One more repo with the same issue: https://github.com/webpack/webpack --to babe736cfa1ef7e8014ed32ba4a4ec38049dce14 --from 3e74cb428af04eedac60ae13d2420d2b5bd3bde1

m09 commented 5 years ago

@EgorBu: was this still a problem during your last runs?

m09 commented 5 years ago

@EgorBu ping

zurk commented 5 years ago

I ran the quality report yesterday and yes, the problem still exists.

zurk commented 5 years ago

So, the problem is in bblfsh and happens on this file: https://github.com/meteor/meteor/blob/0fcc7ddd46d0ef8a278376ba64538210486ab646/packages/accounts-password/email_tests_setup.js

bblfsh does not emit the offset in some cases. Here is the beginning of the UAST tree:

# Positions                                                    Token                   Internal Role            Roles Tree

line: 1 col: 1  line: 1 col: 1                                 ||                      File                     FILE
line: 1 col: 1  line: 1 col: 1                                 ||                      CommentLine              ┣ COMMENT
line: 1 col: 1  line: 1 col: 1                                 ||                      Program                  ┣ MODULE
offset: 190 line: 6 col: 1  offset: 190 line: 6 col: 1         ||                      VariableDeclaration      ┃ ┣ STATEMENT, DECLARATION, VARIABLE
line: 1 col: 1  line: 1 col: 1                                 ||                      CommentLine              ┃ ┃ ┣ COMMENT
offset: 3 line: 2 col: 1  offset: 3 line: 2 col: 1             | a mechanism to inte|  CommentLine              ┃ ┃ ┣ COMMENT
offset: 67 line: 3 col: 1  offset: 67 line: 3 col: 1           | the string "interce|  CommentLine              ┃ ┃ ┣ COMMENT
offset: 133 line: 4 col: 1  offset: 133 line: 4 col: 1         | be retrieved using |  CommentLine              ┃ ┃ ┣ COMMENT
offset: 187 line: 5 col: 1  offset: 187 line: 5 col: 1         ||                      CommentLine              ┃ ┃ ┣ COMMENT
offset: 196 line: 6 col: 7  offset: 196 line: 6 col: 7         ||                      VariableDeclarator       ┃ ┃ ┣ DECLARATION, VARIABLE
offset: 196 line: 6 col: 7  offset: 196 line: 6 col: 7         |interceptedEmails|     Identifier               ┃ ┃ ┃ ┣ EXPRESSION, IDENTIFIER
offset: 216 line: 6 col: 27  offset: 216 line: 6 col: 27       ||                      ObjectExpression         ┃ ┃ ┃ ┗ INITIALIZATION, EXPRESSION, MAP, LITERAL

Hugo said that the latest driver does not have this bug.
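
For anyone hitting this with an older driver, here is a minimal sketch of why the missing offset turns into the TypeError from the traceback. In the tree above, the nodes without an offset all sit at line 1, col 1, where the offset would be 0. Assuming the positions are proto3 messages, ListFields() only lists non-zero scalar fields, so unpacking its result yields two values instead of three. Everything below is a stand-in, not the real code: FakeProtoPosition, fragile_position and robust_position are hypothetical names, and the (offset, line, col) field order of Position is guessed from the error message.

```python
from collections import namedtuple

# Assumed shape of the Position type in feature_utils.py (guessed from the error).
Position = namedtuple("Position", ("offset", "line", "col"))


class FakeProtoPosition:
    """Hypothetical stand-in for a protobuf position message."""

    def __init__(self, offset=0, line=0, col=0):
        self.offset, self.line, self.col = offset, line, col

    def ListFields(self):
        # Mimics proto3 behaviour: zero-valued scalar fields are not listed.
        fields = (("offset", self.offset), ("line", self.line), ("col", self.col))
        return [(name, value) for name, value in fields if value != 0]


def fragile_position(pos):
    # Same pattern as the failing line: relies on all three fields being listed.
    return Position(*[f[1] for f in pos.ListFields()])


def robust_position(pos):
    # Reads fields by name, so an unlisted offset simply comes back as 0.
    return Position(pos.offset, pos.line, pos.col)


start_of_file = FakeProtoPosition(offset=0, line=1, col=1)
try:
    fragile_position(start_of_file)
    failed = False
except TypeError:
    # Only two values were unpacked, so 'col' is reported as missing.
    failed = True

fixed = robust_position(start_of_file)
```

Reading the fields by attribute instead of unpacking ListFields() might be a reasonable workaround on our side, independent of the driver version, but I have not checked it against the real UAST objects.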