github / semantic

Parsing, analyzing, and comparing source code across many languages
8.97k stars 452 forks source link

Unmarshall dealing with MISSING and ERROR nodes #638

Closed alanz closed 3 years ago

alanz commented 3 years ago

I am starting to experiment with interactive usage of tree-sitter with semantic.

One of the key motivations is fast error reporting.

Here is a trivial partly made up example.

-odu le(foo)
-xxxx(yyy).

The second line is a valid attribute definition, the first has an extra space and is missing a terminating ..

The sexp representation I get from tree-sitter is

(source_file
 (attribute (atom) (ERROR) (atom) (MISSING \".\"))
 (attribute (atom).        (atom)))`

But semantic (via the TH semantic-ast process) quite happily generates an AST, ignoring the two error values.

I know the goal of semantic is to analyse code, in a best effort way, even in the presence of bug, so this behaviour is fine.

But how could I modify the unmarshall so it also returns information on the ERROR and MISSING nodes?

patrickt commented 3 years ago

Yeah, we have a bug out for this: https://github.com/github/semantic/issues/574

If I remember correctly, the problem here is that unmarshalNode in semantic-ast’s AST.Unmarshal shouldn’t throw an UnmarshalError: what we want instead is for the resulting node’s members to wrapped in the Fail constructor of AST.Parse.Error. I took a shot at this a while back, but if memory serves, I ran into problems both with generics and with the types of AST nodes being inconvenient (they should really be (Type -> Type) -> Type -> Type so that we can do a higher-order map over the error functors (which are now hardcoded to AST.Parse.Err).

alanz commented 3 years ago

Yeah, we have a bug out for this: #574

Oops, I did do a search before reporting, but missed that.

And for my purposes, the current behaviour is fine, but doing something like making MatchM return a list of nodes with issues as well would work perfectly. As in it should not fail, just report as well.

patrickt commented 3 years ago

Closing this in favor of #574.