Investigate the possibility to test compliance of a Python engine

baldimir commented 2 years ago

There are Python DMN engines, so it would be good to be able to test compliance of such engines too. Investigate the possibility.

StrayAlien commented 2 years ago

Perhaps the test runner should be standalone code that communicates with the engine-under-test via http. So, the engine-under-test would need to spin up a small http server and accept incoming standardised models & requests, and provide a response.

I guess we'd need to establish what that request/response protocol looked like as it'd need to take into account providing a 'package' of models to test 'imports', not just standalone models.
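To make the idea concrete, here is a minimal sketch of what the engine-under-test side could look like in Python, using only the standard library. The `/evaluate` path, the `invocable`/`inputs`/`result` field names, and the echoing `evaluate` stand-in are all assumptions for illustration; nothing here is an agreed protocol, and it does not yet address transmitting a 'package' of models.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def evaluate(request: dict) -> dict:
    """Stand-in for a real engine call (hypothetical): echoes the inputs back."""
    # A real engine would look up the named invocable in its loaded models
    # and evaluate it against the supplied inputs.
    return {"invocable": request["invocable"], "result": request["inputs"]}


class EvaluateHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/evaluate":  # hypothetical endpoint name
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            request = json.loads(self.rfile.read(length))
            status, payload = 200, evaluate(request)
        except (ValueError, KeyError, TypeError) as exc:
            # Malformed JSON or missing fields are reported as a client error.
            status, payload = 400, {"error": str(exc)}
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def make_server(port: int = 0) -> HTTPServer:
    """Bind to localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), EvaluateHandler)
```

Calling `make_server(8080).serve_forever()` would start the engine side; a runner in any language could then POST JSON to it.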

baldimir commented 2 years ago

There is this, which may be used: https://jpy.readthedocs.io/en/latest/

ghost commented 1 year ago

This is exactly the same case as in DMNTK (Rust). The DMNTK engine loads all compatibility tests and exposes endpoints to evaluate them (it makes no difference whether one, two or hundreds of models are loaded). Then a separate test runner (a separate application) reads the test cases and evaluates them, just like models deployed in production. It all happens over HTTP using a JSON API. For DMNTK we have implemented a protocol like the one mentioned by @StrayAlien. It is currently based on the test file schema. If you think it could be a good starting point, I can share a JSON schema for this protocol and some explanation of how it is done in DMNTK.

StrayAlien commented 1 year ago

Hi @dmntk, first of all, congratulations and well done on the Rust impl of DMN. It's something I have long been considering doing. It ain't easy, I imagine. I've done a Java and a JavaScript (TypeScript) impl and have really found the spec to be rather, well, Java-centric. Date handling and BigDecimal in particular. I had often wondered whether the date/time/duration handling in Rust was capable. Seems maybe it is. Nice. But that is another conversation. Nice to meet you, even by way of GitHub.

Expressing an opinion here, but I think if we created a 'test runner' that invoked a runtime via a standardised (http) API spec then we'd only ever need one test runner for all languages and have a single way to test and report compliance.

I was imagining the API would accept the test model(s) and also execute tests against them.

You mention JSON? The spec is XML. Also, the 'standardised' API would need to handle multiple models being transmitted to handle 'imports'. For the sake of naming things, maybe let's call it a 'package'. I had a quick stab at getting multiple XML models into a single XML 'package' file and it proved troublesome. Namespace collisions, id collisions. It was hard to verify with the schema. Etc.

I ended up thinking that multi-part MIME might be the most flexible way to handle transmission of multiple models. Either that or an encoded zip. I'm not sure either is very tasteful, but given the schema complications of jamming multiple XML model definitions and associated namespaces into a single textual XML, other ways have to be considered. I say XML as we have no established standard for the representation of a DMN model as JSON.
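As a rough illustration of the 'encoded zip' option, the sketch below packs several model files into one zip archive and Base64-encodes it so it can travel as a single text field. The function names and the idea of keying models by file name are assumptions for illustration only; each XML document stays untouched, so the namespace and id collisions mentioned above do not arise.

```python
import base64
import io
import zipfile


def pack_models(models: dict[str, str]) -> str:
    """Zip the given {file name: XML text} mapping and return it as Base64 text."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, xml_text in models.items():
            archive.writestr(name, xml_text)  # str data is stored as UTF-8
    return base64.b64encode(buffer.getvalue()).decode("ascii")


def unpack_models(encoded: str) -> dict[str, str]:
    """Reverse of pack_models: decode, unzip, and return {file name: XML text}."""
    raw = base64.b64decode(encoded)
    with zipfile.ZipFile(io.BytesIO(raw)) as archive:
        return {
            name: archive.read(name).decode("utf-8")
            for name in archive.namelist()
        }
```

The receiving engine would call `unpack_models` and resolve imports between the files itself, exactly as it would for files on disk.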

That is, unless you are considering the API does not involve transmission of models, just requests and responses.

At any rate, glad to kick off discussions. And again, well done on DMNTK. Btw, it's great you're publishing your metrics.

ghost commented 1 year ago

Hi @StrayAlien, thanks for your congratulations. There is a long way behind us and still a long way ahead before production readiness (the current version is 0.2.0, so I may say 20% is ready ;-)). Anyway, thanks again!

A few versions ago in DMNTK, I already had an implementation like the one you have described above. I was sending the test models to the execution engine, deploying them and then testing them with the separate runner via HTTP and JSON. Sending the tested models was as simple as converting the XML (it is just some text ;-)) to a Base64-encoded string and POSTing it as a JSON payload. I used this API for over two years during development.
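The Base64-in-JSON step described above can be sketched in a few lines of Python. This is not DMNTK's actual payload format; the `content` field name is a hypothetical placeholder.

```python
import base64
import json


def build_deploy_payload(xml_text: str) -> str:
    """Wrap a model's XML text as a Base64 string inside a JSON document."""
    encoded = base64.b64encode(xml_text.encode("utf-8")).decode("ascii")
    return json.dumps({"content": encoded})  # "content" is an assumed field name


def read_deploy_payload(payload: str) -> str:
    """Engine side: recover the original XML text from the JSON payload."""
    return base64.b64decode(json.loads(payload)["content"]).decode("utf-8")
```

Base64 keeps the XML intact regardless of quoting or encoding declarations inside it, at the cost of roughly a third more bytes on the wire.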

But I have now dropped this solution for a few reasons:

security - when someone forgets to block such endpoint on production, then everyone may deploy any number of own models,
performance - sending and deploying models takes more time than just loading them during server startup,
rustification - Rust prevents many "Java natural" operations on data - so the code was too complicated to maintain REST deployable models (I was dreaming about turning back to Java then ;-)).

So, the current solution in DMNTK is very simple (it will be officially available in version 0.3.0):

you start the execution engine, pointing it at a directory with test cases, like ~/tck/TestCases,
the engine loads all models (*.dmn files) and is ready for testing,
test runner reads the test cases from all *.xml files and evaluates them.
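The discovery part of these steps could look roughly like the Python sketch below. It only finds the files; grouping them by parent directory is an assumption about how the test cases are laid out, and the function name is made up for illustration.

```python
from collections import defaultdict
from pathlib import Path


def discover(root: Path) -> dict:
    """Group *.dmn models and *.xml test files by the directory they live in."""
    suites = defaultdict(lambda: {"models": [], "tests": []})
    for path in sorted(root.rglob("*")):
        if path.suffix == ".dmn":
            suites[path.parent]["models"].append(path)  # loaded by the engine
        elif path.suffix == ".xml":
            suites[path.parent]["tests"].append(path)   # read by the test runner
    return dict(suites)
```

With something like `discover(Path.home() / "tck" / "TestCases")`, the engine would consume the `models` lists at startup and the runner would consume the `tests` lists.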

The protocol I had in mind is the JSON format for transferring invocable's input data from test runner to execution engine and of course getting back the results.

So the general rule is:

the execution engine is responsible for loading the tested models and exposing a dedicated endpoint for testing,
the test runner uses the standardized protocol to test (evaluate) all deployed invocables.

With such an approach, we could test any execution engine with a single test runner that generates unified stats for all vendors ;-).
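A language-neutral runner along these lines might be sketched as follows in Python. This is not the DMNTK protocol: the `/evaluate` path, the JSON field names and the shape of a test case (`invocable`, `inputs`, `expected`) are assumptions made only to show the split between transport and comparison.

```python
import json
import urllib.request
from typing import Callable

# An evaluator takes (invocable name, input data) and returns the engine's result.
Evaluator = Callable[[str, dict], object]


def http_evaluator(base_url: str) -> Evaluator:
    """Build an evaluator that POSTs JSON to a hypothetical /evaluate endpoint."""
    def evaluate(invocable: str, inputs: dict) -> object:
        body = json.dumps({"invocable": invocable, "inputs": inputs}).encode("utf-8")
        request = urllib.request.Request(
            f"{base_url}/evaluate",
            data=body,
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(request) as response:
            return json.load(response)["result"]
    return evaluate


def run_cases(cases: list[dict], evaluate: Evaluator) -> dict:
    """Evaluate every case and return the same stats shape for any engine."""
    stats = {"passed": 0, "failed": 0, "errors": 0}
    for case in cases:
        try:
            actual = evaluate(case["invocable"], case["inputs"])
        except Exception:
            stats["errors"] += 1  # transport or engine failure
            continue
        stats["passed" if actual == case["expected"] else "failed"] += 1
    return stats
```

Because the runner only depends on the `Evaluator` callable, the same `run_cases` loop can be pointed at any vendor's engine by calling `http_evaluator` with a different base URL.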

Anyway, this is just the way we did it in DMNTK, it may be useful for other engines (or not), but I guess, it is always a good thing to share some ideas ;-)

StrayAlien commented 1 year ago

Hi @dmntk, I had figured that is the approach you might be taking if not sending models. I've no arguments with it and it seems reasonable. Though, since we're not all Rustaceans (yet!), maybe an on-development of the existing test runner might be more accessible? Though, tbh, I've never really looked at the existing Java test runner. I have my own, but I think a standardised one is the right thing to do as more languages seem to come on board.

ghost commented 1 year ago

The common runner does not have to be written in Rust. Just the common API must be agreed between DMN engine implementations, but this is out of the scope of this discussion I guess. ;-)

dmn-tck / tck

Investigate the possibility to test compliance of a Python engine #515