Closed harrysarson closed 3 years ago
Thanks a lot for this! If you don't mind, I'd prefer keeping my focus on finishing my current work on pubgrub-rs before switching back to this. Didn't know you'd port that so fast ^^. I'll probably not answer for a couple of weeks but feel free to continue discussing things here and making progress on that with @lydell until I join the discussion later :)
Didn't know you'd port that so fast
It turned out to be quite straightforward! I guess the hard bit is speccing out the design space, "code being the easy part" after all.
Exciting! :tada:
Thanks for the feedback :) I will let this sit for a couple weeks before starting the polishing just to give @mpizenberg some time.
Just a small update to let you know I'll soon be able to review this, in the coming week very likely
I have got the tree sitter ball rolling here https://github.com/Razzeee/tree-sitter-elm/pull/57
I've had a very quick look at the shape of the PR (not the logic yet). I feel like there are quite a few things that add noise because they are not directly related to the "no elmi" implementation. I'll try to extract those first in another PR to have a cleaner diff. "Make the change easy" as would say Dillon.
I've another high-level question. The PR adds a build step using the "cc" crate. I suppose that's because of tree sitter. Is it going to be an issue for portability and to generate a statically compiled executable, preferably based on musl?
Regarding the logic of the PR, would you mind do a summary of the necessary changes to what's existing and of what's new?
I'll try to extract those first in another PR to have a cleaner diff.
I can have a go at that, mainly changes to src/run.rs
I think. Saying that there are not many unrelated changes, this parser approach requires slightly different input information and produces slightly different output information so there are needed changes to the run code to accommodate.
I've another high-level question. The PR adds a build step using the "cc" crate. I suppose that's because of tree sitter. Is it going to be an issue for portability and to generate a statically compiled executable, preferably based on musl?
I can and will remove the dev dep on cc
(it is a left over from the first draft of this PR when tree-sitter-elm had not yet been published to crates.io. There will be no custom build step in this repository.
Saying that, there will still be a build step compiling the tree-sitter c code (just managed by the tree-sitter-elm crate rather than by us). The libraries created by the build step are statically linked into the executable and will use the configured compiler. Having a build step is 100% compatible with static musl binaries.
Regarding the logic of the PR, would you mind do a summary of the necessary changes to what's existing and of what's new?
Certainly, watch this space...
I've updated master and merged changes here to remove things unrelated to the PR. I'll take some time in the coming days to remind myself how elm-test-rs works (it has been some time ^^) to better understand what that PR brings.
I've started by looking at the logic to get modules names. I'm wondering, would it be possible to extract modules names inside parser::all_tests
such that TestModule
has a name
field that would be extracted by tree sitter? That would avoid all the code related to get_module_name
.
extract modules names
Do you mean extracting this part?
module Some.Name exposing
^^^^^^^^^
If so, just as a heads up: In node-test-runner we don’t use that module name. We just skip over it when parsing. Instead, we calculate the module name from the file path and source directories. This is because you can’t trust the module name – it can be wrong. In node-test-runner we don’t want to generate an import Some.Name
where that points to no existing file – that gave bad error messages. Instead we ensure our generated imports are correct so the Elm compiler can report the module
line being wrong in the user’s file. Sorry if this isn’t at all what you’re talking about.
I see. I thought there was no point double checking that this line is correct since the elm compiler also does it. In elm-test-rs we do this this just before parsing all tests:
eprintln!("Compiling all test files ...");
compile(
&tests_root, // current_dir
&options.compiler, // compiler
"/dev/null", // output
&module_paths, // src
);
where this basically calls elm make [module_paths]... output=/dev/null
.
At that point I thought the elm compiler already checks that all tests file are valid including their import
statements, and we can safely use those lines.
Actually, that compilation step was needed before in order to get the .elmi
files. It seems though that this compilation is not needed anymore, so no safe net anymore if I remove it.
So if I'm getting it right, the function add_kernel_test_checking
patches Test
variants definitions to detect any user definition of type Test
at runtime when initializing a worker for Runner.elm
. It works thanks to the fact that any test is defined with one the variants of the base Test
type, and because the elm compiler does not do optimizations such as inlining that could sidestep the variants patch.
If that is right, I've only one thing left to review, the parse::all_tests
function. I have not read it yet, but am right to assume that given add_kernel_test_checking
checks if something is a test, parse::all_tests
would actually return all exposed top level definitions, and not all tests?
parse::all_tests
would actually return all exposed top level definitions, and not all tests?
I don’t know about this Rust version, but for node-test-runner this is almost correct. In node-test-runner, the function is called extractExposedPossiblyTests
. It extracts everything that could be a test, so not Uppercase
exposed stuff, and not f x =
(when exposing (..)
, because a function can never be a Test
, and this means we can simplify the parser by only looking for names followed by an equals sign basically).
It works thanks to the fact that any test is defined with one the variants of the base Test type
Just realizing, this would not detect a "test in the making" that would be temporarily defined by Debug.todo "write this test"
. But I suppose that's a very niche case. Do you deal with that in node-test-runner?
Just realizing, this would not detect a "test in the making" that would be temporarily defined by Debug.todo "write this test".
Just just realizing that this would be a constant definition with a Debug.todo
and those make the program crash on start anyway so we don't need to care about this
Is tree-sitter a parser generator? I mean, the tree-sitter-elm repo seems to mainly contain JS code, and yet it seems that the build step for the rust package only build two files, parser.c
and scanner.cc
. So I guess those are generated by the logic in the JS code?
I don’t know much about tree-sitter but I managed to find this:
https://tree-sitter.github.io/tree-sitter/
Tree-sitter is a parser generator tool and an incremental parsing library. It can build a concrete syntax tree for a source file and efficiently update the syntax tree as the source file is edited.
https://tree-sitter.github.io/tree-sitter/creating-parsers
In order to develop a Tree-sitter parser, there are two dependencies that you need to install:
- Node.js - Tree-sitter grammars are written in JavaScript, and Tree-sitter uses Node.js to interpret JavaScript files. It requires the node command to be in one of the directories in your PATH. You’ll need Node.js version 6.0 or greater.
- A C Compiler - Tree-sitter creates parsers that are written in C. In order to run and test these parsers with the tree-sitter parse or tree-sitter test commands, you must have a C/C++ compiler installed. Tree-sitter will try to look for these compilers in the standard places for each platform.
It was quite some fun playing around the tree-sitter parser so I wanted to try its query system. I've rewritten get_explicit_exposed_values
in a way that uses tree-sitter queries in that last commit. What do you think?
I've moved all the code related to the query approach at the end of the file: https://github.com/mpizenberg/elm-test-rs/blob/77b762e38d35b3f30bff5bc5a541311f845dff06/src/parser.rs#L347
One advantage I can see is that we could get rid of the ChildCursor
struct, the check_kind
, skip_comments
, child
, next_sibling
helper functions, and the ExplicitExposedValuesError
type.
I cleaned up by removing the tree-walking approach so that only the code for the query approach is left. It looks quite nice in my opinion but we can revert the commit if there are things you don't like.
Hey! This is looking great! To apologise the product of free time and motivation that powers open source is looking very low for me at the moment and with Christmas coming it might stay that way to a while.
I hadn't seen the query system, it looks much better than what I had. I always felt that that must be a better API but I couldn't find it, looks like it is queries!
I like the new changes a lot :) feel free to run with this, I will try to chip in when I can :)
I feel you, I have those time too ^^, thanks for the feedback @harrysarson and of course take the time you need, no pressure.
@harrysarson @lydell Thanks a lot for this contribution! I'm about to merge this. Do you mind if I add the two of you in the authors
field of the Cargo.toml
file? And if that's ok, which email should I include?
Thanks! At this point, I’d rather not be listed as an author, since I have 0 commits and don’t know Rust. A few bits of comments and Elm and maybe test cases might be copied from stuff I wrote, but I don’t mind. I’m just happy to have been able to help!
I wouldn't say no! Thank you that is very kind. harry.sarson@hotmail.co.uk :)
Awesome, I'll add a commit on master directly :+1:
Some notes from recent tests. The package elm-geometry is one such package that exhibits a duplicate test names error with elm-test-rs. Here is the result of me searching for one such conflicting name:
$> rg "Every point on an arc is within its bounding box" tests/
tests/Tests/EllipticalArc2d.elm
159: "Every point on an arc is within its bounding box"
tests/Tests/Arc3d.elm
118: "Every point on an arc is within its bounding box"
tests/Tests/Arc2d.elm
183: "Every point on an arc is within its bounding box"
tests/Tests/EllipticalArc3d.elm
87: "Every point on an arc is within its bounding box"
Also in my last few commits when I extracted check
out into the line |> List.filterMap check
I introduced a bug. Indeed the list above is not homogeneous before applying the check
function. So I've now restored the behavoir of inlining the check
for the list items and filtering with List.filterMap identity
.
Applying the work done in https://github.com/rtfeldman/node-test-runner/pull/442 to the rust port of the test runner. Credit for the should go to @lydell as this is a port of their work.
Interested to hear feedback on: