We mainly have two types of tests that run end-to-end on bindings:
- "Golden file" tests - generate bindings and compare them to checked-in reference bindings.
- "Runtime" tests - import the reference bindings into a test file and use them. If the reference bindings are equal to the generated bindings, these tests passing implies that the generated bindings are correct.
The "Golden file" tests generate and compare reference bindings for given config. It's so common that doing it is a one line [generateAndCompareBindings(config..)
].
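For illustration, a golden-file test under this scheme might look roughly like the following; the config name, the import path, and the helper's exact signature are assumptions, not the actual jnigen test code.

```dart
// Rough sketch of a golden-file test. `generateAndCompareBindings` is the
// helper mentioned above; `simplePackageConfig` and the import path are
// hypothetical stand-ins for whatever the real test suite uses.
import 'package:test/test.dart';

import 'test_util/test_util.dart';

void main() {
  test('simple_package bindings match the checked-in reference', () async {
    // Generates bindings for the config and diffs them against the committed
    // reference bindings, failing on any mismatch.
    await generateAndCompareBindings(simplePackageConfig);
  });
}
```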
You can probably commit both types of bindings as reference bindings, and then modify this function to compare both reference bindings.
For "Runtime" bindings, however, it probably leads to some amount of code duplication, because imports will be different.
There's also the issue that this will lead to ever-inflating number of committed bindings, which is not great for PR review workflow.
I believe there's another viable option: generating the majority of the test suite at runtime. Something like:
- Create a new directory in the project root: `integration_test/`.
- In `integration_test/`, add `lib/` to `.gitignore`.
- Ignore `integration_test/test` in `analysis_options.yaml`.
- `dart test` will have a stub which finds this test folder, generates the missing files using jnigen, and runs the tests. (This will spawn another `dart test` process; will that be problematic?)
Besides running tests, normal lints and format checks can also be run.
There will always be a minimal set of golden-file bindings in `jnigen/test/` so we can still recognize what has changed in the generated code. But this will prevent a proliferation of such files when we need to add new runtime tests.
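A minimal sketch of what such a stub could look like, assuming hypothetical paths (`integration_test/lib/bindings.dart`, `integration_test/jnigen.yaml`) and that the bindings are generated by invoking jnigen through `dart run`:

```dart
// Hypothetical stub: generate missing bindings with jnigen, then spawn a
// nested `dart test` run for integration_test/ and forward its exit code.
import 'dart:io';

Future<void> main() async {
  final bindings = File('integration_test/lib/bindings.dart');
  if (!bindings.existsSync()) {
    // lib/ is git-ignored, so generate its contents on the first run.
    final gen = await Process.run('dart',
        ['run', 'jnigen', '--config', 'integration_test/jnigen.yaml']);
    if (gen.exitCode != 0) {
      stderr.write(gen.stderr);
      exit(gen.exitCode);
    }
  }
  // The nested `dart test` process streams its output directly to ours.
  final test = await Process.start('dart', ['test'],
      workingDirectory: 'integration_test',
      mode: ProcessStartMode.inheritStdio);
  exit(await test.exitCode);
}
```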
When implementing dart-lang/native#669, it's desirable to test both Dart+C and Dart-only bindings. So what implementation strategy should I go with?
- Check in pure-Dart and C-based bindings for all tests.
- Only check in a few bindings, and generate the others on the fly.
> When implementing dart-lang/native#669, it's desirable to test both Dart+C and Dart-only bindings. So what implementation strategy should I go with?
> - Check in pure-Dart and C-based bindings for all tests.
> - Only check in a few bindings, and generate the others on the fly.
Let's focus on "runtime" tests. I think we don't want to duplicate the tests, so we would want to generate the bindings on the fly and test them using the same file. We might even generate the pure-Dart and C-based ones at the same path, so we don't even have to change the import. Another test could generate the bindings, analyze them, check them against the golden files (if they exist), and run the runtime tests on the generated bindings.
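As a rough illustration of that idea - the helper, the enum, and the paths below are hypothetical placeholders, not jnigen's actual API:

```dart
// Hypothetical sketch: generate each flavor of bindings into the same output
// path, then run the unchanged runtime test file against whichever flavor is
// currently on disk.
import 'dart:io';

enum BindingsKind { cBased, dartOnly }

Future<void> generateBindings(BindingsKind kind, String outputPath) async {
  // Placeholder: invoke jnigen with a config that selects `kind` and writes
  // the Dart bindings to `outputPath`.
}

Future<void> main() async {
  const outputPath = 'test/generated_bindings/';
  for (final kind in BindingsKind.values) {
    await generateBindings(kind, outputPath);
    // The runtime test imports from `outputPath`, so no import changes are
    // needed when switching flavors.
    final result =
        await Process.run('dart', ['test', 'test/runtime_test.dart']);
    stdout.write(result.stdout);
    if (result.exitCode != 0) exit(result.exitCode);
  }
}
```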
I think there's still one advantage we gain by duplicating tests: consistent reporting. We run `dart test` and get the results of all tests in a consistent way.
These are some requirements, in descending order of priority:
- We don't want to modify any file that's checked into git.
- We want IntelliSense to work when writing tests, but we want to avoid analyzer errors in CI.
- Utilize the same configs from the golden-file tests we have now.
- Get the output of `dart test` showing the number of tests passed and failed. This will probably require separating the runtime tests from the `dart test` run in the `jnigen/` folder.
I think the design I proposed above meets most of these requirements. But it's a very unconventional layout, and it separates the runtime tests from the `dart test` run in the `jnigen/` project directory.
One thing I am still not sure about is IntelliSense. That might need some nested `analysis_options.yaml` trickery. We basically want analysis to happen only if the bindings were generated by the tool.
> We don't want to modify any file that's checked into git.
Why not? We could remove the C-based bindings committed by default, change them to pure-Dart ones, and rerun the runtime tests.
This would solve some of our IntelliSense issues.
Edit: although we want to use other conditions.
> Why not? We could remove the C-based bindings committed by default, change them to pure-Dart ones, and rerun the runtime tests.
Because the test would alter the repository state, which may affect later tests, or cause the altered files to be committed accidentally.
It's useful to think of it like this: the generated bindings which are checked in must be a "pure function" of the config and source, which are also checked in.
Edit: This may need clarification. You can revert the changes at the end, but that doesn't help if the test is interrupted, or if other golden-file tests run interleaved with the bindings tests. I would rather leave every checked-in file unmodified than create further problems down the line.
How about this:
- Check in both types of bindings - this will allow us to see changes in both when making code generator changes.
- Based on the existing tests for one type (`c_based`), generate the other type of bindings at runtime and replicate the tests in files named `*_replicated_test.dart`.

We already depend on the summarizer being built before the tests run, so why don't we wrap all of this into `tool/run_tests.dart` and error when `dart test` is run directly?
```
// dart run jnigen:setup
// generateReplicatedTests()
// dart test, stream results to stdout
// delete replicated bindings
```
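Expanded into a rough `tool/run_tests.dart` sketch - the replication helper, the file layout, and the cleanup step are assumptions on my part:

```dart
// tool/run_tests.dart - sketch of the wrapper outlined above.
import 'dart:io';

Future<int> run(String executable, List<String> args) async {
  final process = await Process.start(executable, args,
      mode: ProcessStartMode.inheritStdio);
  return process.exitCode;
}

Future<void> main() async {
  // dart run jnigen:setup
  if (await run('dart', ['run', 'jnigen:setup']) != 0) exit(1);

  // generateReplicatedTests(): copy the c_based runtime tests and rewrite
  // their imports to target the other bindings (hypothetical helper).
  // await generateReplicatedTests();

  // dart test, streaming results to stdout via inheritStdio.
  final exitCode = await run('dart', ['test']);

  // Delete the replicated tests so no generated file lingers in the repo.
  Directory('test')
      .listSync(recursive: true)
      .whereType<File>()
      .where((f) => f.path.endsWith('_replicated_test.dart'))
      .forEach((f) => f.deleteSync());

  exit(exitCode);
}
```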
> Check in both types of bindings - this will allow us to see changes in both when making code generator changes.
That's good.
We can have some `pre_test.dart` that simply replicates the given runtime tests and changes the import path. We can run this in CI before running `dart test`. It's basically like having a copy of the test.
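A possible shape for such a `pre_test.dart`, assuming the tests follow a `*_runtime_test.dart` naming scheme and that the import rewrite is a simple path substitution (both assumptions on my part):

```dart
// Hypothetical pre_test.dart: copy each runtime test and point its import at
// the Dart-only bindings instead of the C-based ones. The directory names and
// the `_runtime_test.dart` suffix are assumptions, not jnigen's real layout.
import 'dart:io';

void main() {
  final files = Directory('test')
      .listSync(recursive: true)
      .whereType<File>()
      .where((f) => f.path.endsWith('_runtime_test.dart'));
  for (final file in files) {
    // Rewrite the bindings import so the copy exercises the other flavor.
    final replicated = file.readAsStringSync().replaceAll(
        'c_based/dart_bindings/', 'dart_only/dart_bindings/');
    final newPath = file.path
        .replaceFirst('_runtime_test.dart', '_replicated_test.dart');
    File(newPath).writeAsStringSync(replicated);
  }
}
```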
> We can have some `pre_test.dart` that simply replicates the given runtime tests and changes the import path. We can run this in CI before running `dart test`. It's basically like having a copy of the test.
I'd suggest we just wrap all of this into a single script as described above, and error on directly invoking `dart test`, telling the user to run `tool/run_tests.dart` instead.
The other choice is letting `dart test` run against one (or no) type of bindings, which doesn't sound that good to me. But who knows, my intuition is often wrong.
> I'd suggest we just wrap all of this into a single script as described above, and error on directly invoking `dart test`, telling the user to run `tool/run_tests.dart` instead.
Any reason to combine the two?
My reasoning for having separate steps: `dart test` will still run fine, it still tests one of the two bindings, and all the VS Code test plugin integrations still work nicely with the tests. The script is there to create the "missing" test from the "existing" one. The coverage gets calculated correctly in CI.
I actually ended up following the way you suggested, albeit with a few changes.
When `dart test` is run without the pre-setup, no runtime tests are run. This is because we can have at most one JVM, and it needs to have all the classpaths up front (we can't add to the classPath at runtime). So `generated_runtime_test.dart` takes care of calling all the tests defined in the different files.
This is kind of similar to how we separated `runTests` in JNI.
This means the VS Code UI for running individual tests doesn't work very nicely, however. (I always use the command line, so I never thought about it.)
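For context, the aggregator idea looks roughly like this; the file names, the jar paths, and the `registerTests()` convention are illustrative, not the actual jnigen test code:

```dart
// generated_runtime_test.dart (sketch): spawn the single JVM with every
// classpath up front, then run the test groups defined in the other files.
import 'package:jni/jni.dart';

// Hypothetical per-suite files that expose a registration function instead of
// their own main().
import 'simple_package_test.dart' as simple_package;
import 'kotlin_test.dart' as kotlin;

void main() {
  // Only one JVM can exist per process, and its classpath can't be extended
  // later, so every jar has to be listed here.
  Jni.spawn(classPath: ['build/simple_package.jar', 'build/kotlin.jar']);
  simple_package.registerTests();
  kotlin.registerTests();
}
```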
Currently some tests run in Dart-only mode and some others run in Dart+C mode. We only run `dart analyze` on the other mode.