Testing of an interpreter

rust-analyzer / rowan

Testing of an interpreter #72

Closed (lunacookies closed 4 years ago)

lunacookies commented 4 years ago

Hi,

I’m currently working on an interpreted language that uses rowan to represent its syntax trees. To interpret code, I’ve created a typed layer on top of Syntax{Node,Element,Token}s using the macro-based approach used in the s-expressions example. I added tests as I wrote the parser, asserting that the Debug representation of SyntaxNode looked as expected.

However, I did not add tests as I wrote either the typed layer or the interpreter. Just for context, the interpreter is wholly based on the typed layer, and does not use any other data structures. After completing the initial implementation of the interpreter, I found that it was laden with bugs – I guess that’s what you get when you don’t write tests.

How can I test the interpreter if I can’t construct Syntax{Node,Element,Token}s by hand? Is the recommended way to invoke the relevant sub-parsers to create typed layer nodes from strings?

CAD97 commented 4 years ago

Syntax nodes are purely a view of a green tree, and cannot exist without a green tree.

Your parser builds a green tree, but nothing prevents you from writing down the steps the parser would take rather than just parsing some text.
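To make "writing down the steps" concrete, here is a minimal sketch using only the standard library. `Kind`, `Tree`, and `Builder` are hypothetical stand-ins, not rowan types; the method names are merely modelled on the `start_node` / `token` / `finish_node` shape of rowan's `GreenNodeBuilder`, and the real API should be checked against rowan's documentation.

```rust
// Toy stand-in for a green-tree builder. This is NOT rowan's type; it only
// models the idea of replaying the calls a parser would make.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind { Root, BinExpr, Number, Plus, Whitespace }

#[derive(Debug, PartialEq)]
enum Tree {
    Node(Kind, Vec<Tree>),
    Token(Kind, String),
}

#[derive(Default)]
struct Builder {
    // Nodes that have been started but not yet finished, innermost last.
    stack: Vec<(Kind, Vec<Tree>)>,
    root: Option<Tree>,
}

impl Builder {
    fn start_node(&mut self, kind: Kind) {
        self.stack.push((kind, Vec::new()));
    }

    fn token(&mut self, kind: Kind, text: &str) {
        let (_, children) = self.stack.last_mut().expect("token outside of a node");
        children.push(Tree::Token(kind, text.to_string()));
    }

    fn finish_node(&mut self) {
        let (kind, children) = self.stack.pop().expect("no node to finish");
        let node = Tree::Node(kind, children);
        // Attach to the parent if there is one, otherwise this is the root.
        match self.stack.last_mut() {
            Some((_, parent)) => parent.push(node),
            None => self.root = Some(node),
        }
    }

    fn finish(self) -> Tree {
        assert!(self.stack.is_empty(), "unfinished nodes");
        self.root.expect("no root node was built")
    }
}

impl Tree {
    // Concatenates all token text, so the tree reproduces the source exactly.
    fn text(&self) -> String {
        match self {
            Tree::Token(_, t) => t.clone(),
            Tree::Node(_, children) => children.iter().map(Tree::text).collect(),
        }
    }
}

// The steps a parser would take for the source text `1 + 2`, written by hand.
fn build_one_plus_two() -> Tree {
    let mut b = Builder::default();
    b.start_node(Kind::Root);
    b.start_node(Kind::BinExpr);
    b.token(Kind::Number, "1");
    b.token(Kind::Whitespace, " ");
    b.token(Kind::Plus, "+");
    b.token(Kind::Whitespace, " ");
    b.token(Kind::Number, "2");
    b.finish_node();
    b.finish_node();
    b.finish()
}
```

Nine builder calls stand in for the five characters of `1 + 2`, and every whitespace token has to be spelled out by hand.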

If your parser's test coverage is good enough that you're fairly certain it's correct, though, there's no specific reason not to reuse it to represent input for later stages in a more human-readable format. As a rule, any representation of the green tree other than the language itself is going to be more verbose and likely more error-prone, because the green tree stores the full text and so contains strictly more (redundant) information than the source.

(You could imagine a setup like YAML where deduplicated nodes are actually deduplicated in the serialized format, but that's still quite difficult to do.)

As an example, rust-analyzer's main tests are basically "here's a snippet of Rust code, here's some facts that should hold in the knowledge database." It's more "integration" than "unit," but in this case, I think that's fair. At some point you have to just assume the bits you're built on work.
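Applied to an interpreter, that "snippet in, facts out" style might look like the sketch below. Everything here is a hypothetical stand-in: `Expr` plays the role of the typed layer, `parse` the role of the existing rowan-based parser, and `check` is an illustrative helper name rather than anything from rowan or rust-analyzer.

```rust
// Hypothetical typed layer; in a real project this would wrap `SyntaxNode`.
#[derive(Debug, PartialEq)]
enum Expr {
    Number(i64),
    Add(Box<Expr>, Box<Expr>),
}

/// Toy parser for `a + b + ...` over integers, standing in for the real one.
fn parse(input: &str) -> Result<Expr, String> {
    let mut terms = input.split('+').map(|t| {
        t.trim()
            .parse::<i64>()
            .map(Expr::Number)
            .map_err(|e| format!("bad number {:?}: {}", t.trim(), e))
    });
    // `split` always yields at least one item, even for an empty string.
    let mut expr = terms.next().expect("split yields at least one item")?;
    for term in terms {
        expr = Expr::Add(Box::new(expr), Box::new(term?));
    }
    Ok(expr)
}

/// The interpreter under test; it only ever sees the typed layer.
fn eval(expr: &Expr) -> i64 {
    match expr {
        Expr::Number(n) => *n,
        Expr::Add(lhs, rhs) => eval(lhs) + eval(rhs),
    }
}

/// Test helper: source text in, expected value out.
fn check(input: &str, expected: i64) {
    let expr = parse(input).expect("test input should parse");
    assert_eq!(eval(&expr), expected, "input: {:?}", input);
}
```

Each interpreter test then shrinks to one line, such as `check("1 + 2", 3)`. If the parser regresses, these tests fail too, which is the integration-style trade-off described above.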

lunacookies commented 4 years ago

Thanks for the info, I guess that’s the route I’ll take (since I am pretty happy with the parser’s test coverage for these purposes).

> At some point you have to just assume the bits you're built on work.

That’s a good point – in any case, if the parser does break, I’d hope that both the parser’s tests and possibly the interpreter’s tests would catch it.

PS: I just wanted to say how cool I find this whole idea of deduplicating entire syntax trees. Thanks to you, matklad and all the other contributors to rowan :)