Remove Java and Javascript harnesses

FiloSottile commented 7 months ago

Now that Wycheproof is community managed (more on that later!), I’d like to propose a significant change: removing the Java and Javascript testing harnesses, and focusing entirely on the JSON test vectors.

One big reason is that we don’t have the Bazel and Java expertise to properly maintain them, and it seems they are in need of maintenance (#83), somewhat undocumented (#84), and designed to run on Google infrastructure that’s not available to us (#86).

More generally, I think the Wycheproof consumption model that established itself in the past years is that of implementations consuming the JSON vectors with harnesses maintained upstream. I want to encourage that: upstream library authors are the best equipped to test their own code.

In particular, in COVID times zero PRs and four issues were about the harness code. Three of the issues are referenced above, and suggest failure to use that code. Only #93 suggests successful use of the harnesses.

The Java and Javascript code is not really just a harness, though: it also implements tests that can’t be represented well with vectors, like timing tests. Those would be lost.

Those harnesses also probably functioned as CI for the test vectors, but I’d rather we build proper GitHub Actions to check that the tests are well-formed, match the schema, and maybe also pass against at least one implementation, just to make sure the tests are not suddenly broken (but not to test the implementation itself).

All things considered, I think the tradeoff is worth it, and we should focus resources into producing, documenting, testing, and facilitating using the vectors instead. I'm looking for feedback on this direction before making the change.

/cc @NeilMadden @jbangert @Sajjon @jvoisin @ascarpino as those involved in issues related to the code to be removed /cc @bleichenbacher-daniel @thaidn @chuckx as previous maintainers /cc @C2SP/stewards

Sajjon commented 7 months ago

A collection of canonical JSON test vector sounds like a good vision for the project!

And if we will go with only one reference implementation - there are far more suitable languages than Java for it! E.g. Rust or possibly Swift - but Rust has far better cryptography ecosystem support.

FiloSottile commented 7 months ago

And if we will go with only one reference implementation - there are far more suitable languages than Java for it! E.g. Rust or possibly Swift - but Rust has far better cryptography ecosystem support.

Heh, well, me being me it would probably be Go :) but again it would just be about making sure a PR doesn't break the tests and cause a bunch of downstream false positives. Tests for the tests, essentially. Doesn't even have to be the same language for all vectors, as long as someone is there to maintain the tests.

bleichenbacher-daniel commented 7 months ago

Heh, well, me being me it would probably be Go :) but again it would just be about making sure a PR doesn't break the tests and cause a bunch of downstream false positives. Tests for the tests, essentially. Doesn't even have to be the same language for all vectors, as long as someone is there to maintain the tests.

One thing I'm currently doing is to keep a small database with test results. Running a test adds an entry to the data base. This allows to generate a list of algorithms with entries such as

"x25519.json": { "type": "Xdh", "alg": { "type": "Xdh", "primitive": "X25519" }, "references": ["RFC 7748"], "test_status": { "pyca": { "test_version": "0.29", "date": "2024-03-16 13:13", "lib_version": "41.0.7", "passed": 236, "skipped": 0, "failed": 0 } } },

bleichenbacher-daniel commented 7 months ago

One of the options to simplify the test environment is to run tests with test vectors through

java/com/google/security/wycheproof/testcases/JsonTest.java

This class takes a file with any test vectors as input and returns a structure with the test results. It is independent of JUnit and the dependencies that are essentially required are the provider to test and a JSON parser. The class allows to generate test files with subsets of the test vectors, so that it is for example possible to rerun just the failing tests. There is also a bit of support for detecting false positives in this class. If no valid test vector in a file passes then it is likely that the test setup is wrong or uses incorrect assumptions and one should check this first before sending out automatic notifications to third parties.

The current setup is to use long lists of test cases like

@Test public void testSecp160k1Sha256() throws Exception { testVerification("ecdsa_secp160k1_sha256_test.json", true); }

@Test public void testSecp160r1Sha256() throws Exception { testVerification("ecdsa_secp160r1_sha256_test.json", true); }

...

This setup is useful if one has tools to monitor the state of test cases over time, but as pointed out makes the tests less flexible and adds more dependencies.

bleichenbacher-daniel commented 3 months ago

I'm currently testing new test vectors against pyca and Rust. That gives about 70% of coverage. When testing against Rust I often struggle with seemingly non-uniform interfaces. Otherwise Rust does indeed support quite a large range of algorithms and primitives.

C2SP / wycheproof

Remove Java and Javascript harnesses #105