Unit tests for JenaQuerierDB are needed

What is needed

At present, unit tests exist for the SystemModelQuerier and SystemModelUpdater classes, which read and write directly to the Jena triple store by constructing and executing SPARQL queries. However, there are no unit tests for the JenaQuerierDB, which interacts with the triple store via the Jena API, uses a set of deserialization classes for system (and domain) model entities, and supports caching of these objects to avoid the overhead of frequent access to the triple store.

In almost all situations, JenaQuerierDB instances are used with the caching feature enabled. This makes it quite difficult to test the triple store access functions. One can create an entity (e.g., a Threat), represented as a deserialized ThreatDB object, then use a JenaQuerierDB instance to store it, but the JenaQuerierDB object just adds the ThreatDB object to a cache and returns the same object when requested in a subsequent read access.

What this means is that updates are applied to the cached entity objects, e.g., to set the likelihood of a threat one uses a method on the ThreatDB object. Since this object is in the cache, all the JenaQuerierDB object does is add the ThreatDB object to the list of new and modified objects in the cache. There is a separate sync() method that serializes new and modified objects in the cache back to the triple store, so really we just need to test that mechanism to check serialization and deserialization functionality.
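
The caching behaviour described above can be illustrated with a minimal, self-contained sketch. It is not the JenaQuerierDB implementation: the `Entity`, `Store` and `Querier` types, their fields and their methods are hypothetical stand-ins, and the "triple store" is just an in-memory map. The sketch only shows why a read after a store proves nothing when caching is enabled, and why a sync() followed by a read through a fresh querier does exercise serialization.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-ins only: none of these types are the real Spyderisk classes.
final class WriteBackCacheSketch {

    /** Stand-in for a deserialized entity such as ThreatDB. */
    static final class Entity {
        final String uri;
        String likelihood;

        Entity(String uri, String likelihood) {
            this.uri = uri;
            this.likelihood = likelihood;
        }
    }

    /** Stand-in for the triple store: entity URI -> serialized likelihood. */
    static final class Store {
        final Map<String, String> rows = new HashMap<>();
    }

    /** Stand-in for a querier with an optional write-back cache. */
    static final class Querier {
        private final Store store;
        private final boolean cacheEnabled;
        private final Map<String, Entity> cache = new HashMap<>();
        private final Set<String> dirty = new LinkedHashSet<>();

        Querier(Store store, boolean cacheEnabled) {
            this.store = store;
            this.cacheEnabled = cacheEnabled;
        }

        /** With caching on, only records the entity as new or modified. */
        void store(Entity e) {
            if (cacheEnabled) {
                cache.put(e.uri, e);
                dirty.add(e.uri);
            } else {
                store.rows.put(e.uri, e.likelihood);
            }
        }

        /** With caching on, returns the cached object if there is one. */
        Entity get(String uri) {
            if (cacheEnabled && cache.containsKey(uri)) {
                return cache.get(uri);
            }
            if (!store.rows.containsKey(uri)) {
                return null;
            }
            Entity e = new Entity(uri, store.rows.get(uri));
            if (cacheEnabled) {
                cache.put(uri, e);
            }
            return e;
        }

        /** Writes new and modified cached entities back to the store. */
        void sync() {
            for (String uri : dirty) {
                store.rows.put(uri, cache.get(uri).likelihood);
            }
            dirty.clear();
        }
    }

    public static void main(String[] args) {
        Store store = new Store();
        Querier first = new Querier(store, true);
        Entity threat = new Entity("threat#1", "Low");
        first.store(threat);

        // The read returns the very object that was stored: nothing was serialized.
        System.out.println(first.get("threat#1") == threat);
        // A fresh querier finds nothing in the store before sync().
        System.out.println(new Querier(store, true).get("threat#1") == null);

        threat.likelihood = "High";
        first.store(threat);
        first.sync();
        // After sync(), a fresh querier deserializes the updated value.
        System.out.println(new Querier(store, true).get("threat#1").likelihood);
    }
}
```

In this sketch, only the last check touches the store in both directions, which is the pattern the third proposed test below relies on.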

There is some subtlety regarding the use of different system model graphs. The asserted graph contains user/client asserted assets and relationships, and is in practice updated via the client API which uses the SystemModelUpdater class. The inferred graph contains other entities (including some assets and relationships) added by the validator, plus likelihood and risk levels and causation relationships from the risk calculator. In some situations, an entity may be split between the two graphs:

- An Asset may be asserted, but be the source of some inferred relationships: the AssetDB entity is an asserted graph resource, but some of its properties (those referencing target assets for inferred relationships) will be specified in triples from the inferred graph.
- An Asset may be inferred, but be the source of some asserted relationships: the AssetDB entity is an inferred graph resource, but some of its properties (those referencing target assets for asserted relationships) will be specified in triples from the asserted graph.
- An asset behaviour may have an asserted impact level: the MisbehaviourSetDB entity is an inferred graph resource, whose impact level is a property specified in a triple from the asserted graph.
- An asset trustworthiness attribute may have an asserted assumed TW level: the TrustworthinessAttributeSetDB entity is an inferred graph resource, whose assumed TW level is a property specified in a triple from the asserted graph.
- An asset control may have an asserted coverage level and/or implementation status: the ControlSetDB entity is an inferred graph resource, whose coverage level and/or status are properties specified in triples from the asserted graph.

When an entity or set of entities is read using a JenaQuerierDB object, the argument list includes strings referring to the required graphs. An EntityDB object is returned for each requested entity that has any properties specified in those graph(s). EntityDB member variables corresponding to properties not defined in the requested graph(s) will be null, except the entity URI and type which are always set in the returned object. If one graph is specified, one gets all entities from that graph and some from the other graph. If both graphs are used, the JenaQuerierDB object still returns one object with variables set based on properties from either graph.
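
The merging rule in the previous paragraph can be sketched as follows. This is an illustration under stated assumptions, not the JenaQuerierDB code: the `Store`, `ControlSetView` and `read` names are hypothetical, the property names `coverageLevel` and `locatedAt` are invented for the example, and the entity type is kept in a separate map simply to model the statement that URI and type are always set.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins only: not the real Spyderisk classes or property names.
final class GraphMergeSketch {

    /** Stand-in for an entity object such as ControlSetDB. */
    static final class ControlSetView {
        final String uri;
        final String type;
        final String coverageLevel; // asserted graph property in this example
        final String locatedAt;     // inferred graph property in this example

        ControlSetView(String uri, String type, String coverageLevel, String locatedAt) {
            this.uri = uri;
            this.type = type;
            this.coverageLevel = coverageLevel;
            this.locatedAt = locatedAt;
        }
    }

    /** Stand-in for the triple store: graph name -> entity URI -> property -> value. */
    static final class Store {
        final Map<String, String> types = new HashMap<>();
        final Map<String, Map<String, Map<String, String>>> graphs = new HashMap<>();

        void add(String graph, String uri, String property, String value) {
            graphs.computeIfAbsent(graph, g -> new HashMap<>())
                  .computeIfAbsent(uri, u -> new HashMap<>())
                  .put(property, value);
        }
    }

    /**
     * Returns one object per entity, merging properties from the requested graphs.
     * Properties defined only in other graphs stay null; if the entity has no
     * properties in any requested graph, null is returned.
     */
    static ControlSetView read(Store store, String uri, String... graphNames) {
        Map<String, String> merged = new HashMap<>();
        for (String name : graphNames) {
            Map<String, Map<String, String>> graph =
                    store.graphs.getOrDefault(name, Collections.emptyMap());
            Map<String, String> props = graph.get(uri);
            if (props != null) {
                merged.putAll(props);
            }
        }
        if (merged.isEmpty()) {
            return null;
        }
        return new ControlSetView(uri, store.types.get(uri),
                merged.get("coverageLevel"), merged.get("locatedAt"));
    }

    public static void main(String[] args) {
        Store store = new Store();
        store.types.put("cs#1", "ControlSet");
        store.add("inferred", "cs#1", "locatedAt", "asset#1");
        store.add("asserted", "cs#1", "coverageLevel", "High");

        ControlSetView assertedOnly = read(store, "cs#1", "asserted");
        ControlSetView inferredOnly = read(store, "cs#1", "inferred");
        ControlSetView both = read(store, "cs#1", "asserted", "inferred");

        System.out.println(assertedOnly.coverageLevel + " " + assertedOnly.locatedAt);
        System.out.println(inferredOnly.coverageLevel + " " + inferredOnly.locatedAt);
        System.out.println(both.coverageLevel + " " + both.locatedAt);
    }
}
```

The three reads in `main` correspond to the three graph selections that the proposed tests exercise: asserted only, inferred only, and both.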

Proposed unit tests

Taking these aspects into account, it is proposed that the following tests should be used:

[ ] Instantiate a JenaQuerierDB object with caching disabled for a simple, pre-validated test case with some assumed TW levels, misbehaviour impact levels and control coverage levels and status specified in the asserted graph. Use JenaQuerierDB.init() to load the model, then test as follows:
- load objects from the asserted graph only, check that the asserted risk calculation inputs are obtained
- load objects from both the asserted and inferred graphs, check that the asserted risk calculation inputs are obtained
- load objects from the inferred graph only, check that the asserted risk calculation inputs are not obtained
- use asserted graph update methods to modify some of the asserted risk calculation inputs and override some of the default values used elsewhere
- load objects from the asserted graph only, check that the modified asserted risk calculation inputs are obtained
- load objects from both the asserted and inferred graphs, check that the modified asserted risk calculation inputs are obtained
- load objects from the inferred graph only, check that the modified asserted risk calculation inputs are not obtained
[ ] Same as above but with caching enabled.
[ ] Same again with caching enabled, but this time use JenaQuerierDB.sync() after updating the risk calculation inputs, then create a new JenaQuerierDB object, use JenaQuerierDB.init() to load the model, and check that the correct values are returned by the new JenaQuerierDB object.

These tests check that the methods designed to modify risk calculation inputs work as expected, which means they can be used to update models prior to risk calculation, when (a) values must be altered to ensure population triplets are consistent, or (b) risk calculations are used in 'what if' scenarios, e.g., by the Control Strategy Recommender algorithm.

The last test also checks that JenaQuerierDB.sync() writes data to the triple store as expected.

Then further tests could be added to check that the JenaQuerierDB initialisation methods work as expected. To do this, it is proposed to add an argument to the Validator.validate() method, controlling whether results are serialized to the triple store using the sync() method at the end. This would emulate the approach used in the RiskCalculator.calculateRiskLevels() method. The following tests could then be used:

[ ] Instantiate a JenaQuerierDB object with caching enabled for a simple asserted model. Call JenaQuerierDB.initForValidation() and pass to a Validator instance, and run the Validator.validate() method with sync() enabled. Then create a new JenaQuerierDB object with caching enabled, call JenaQuerierDB.initForRiskCalculation(), pass it to a RiskCalculator instance and then run the RiskCalculator.calculateRiskLevels() method with sync() enabled. This test replicates the original default sequence as invoked via the client API, and creates an output model that can be used as a reference.
[ ] Repeat this test, but using the first JenaQuerierDB object for both validation and risk calculation, with caching enabled in the JenaQuerierDB object, and sync() used in the validation stage.
[ ] Repeat this test, using the first JenaQuerierDB object for both validation and risk calculation, with caching enabled in the JenaQuerierDB object, and no sync() in the validation stage.
[ ] Repeat this test, using the first JenaQuerierDB object for both validation and risk calculation, with caching disabled in the JenaQuerierDB object, and no sync() in the validation stage.

In each of these tests, the output model should be checked against results obtained by manually running the model. Ideally, this should be done by comparing the whole model, e.g., by using a canonical serialization as JSON files, and using diff between the resulting files. A less stringent but probably still sufficiently sensitive test would be to compare the likelihood levels for specific Misbehaviour Sets after the risk calculation against known values from a manual analysis of the same system model.
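
As a rough sketch of the whole-model comparison, the following code canonicalizes a model as sorted lines and reports the differences. It swaps in a simpler technique than the JSON serialization suggested above: each triple becomes one text line, and the lines are de-duplicated and sorted. The `ModelDiffSketch` class and its methods are hypothetical, triples are plain string arrays rather than Jena objects, and blank nodes (whose labels are not stable between runs) would need extra handling that is not shown.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: a line-per-triple canonical form, not a JSON serialization.
final class ModelDiffSketch {

    /** Turns {subject, predicate, object} triples into sorted, de-duplicated lines. */
    static List<String> canonicalize(Collection<String[]> triples) {
        Set<String> sorted = new TreeSet<>();
        for (String[] t : triples) {
            sorted.add(t[0] + " " + t[1] + " " + t[2]);
        }
        return new ArrayList<>(sorted);
    }

    /** Lists lines missing from the actual model ("- ") and unexpected lines ("+ "). */
    static List<String> diff(List<String> expected, List<String> actual) {
        Set<String> expectedSet = new TreeSet<>(expected);
        Set<String> actualSet = new TreeSet<>(actual);
        List<String> out = new ArrayList<>();
        for (String line : expectedSet) {
            if (!actualSet.contains(line)) {
                out.add("- " + line);
            }
        }
        for (String line : actualSet) {
            if (!expectedSet.contains(line)) {
                out.add("+ " + line);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> reference = new ArrayList<>();
        reference.add(new String[] {"ms#1", "hasLikelihood", "High"});
        reference.add(new String[] {"ms#2", "hasLikelihood", "Low"});

        List<String[]> output = new ArrayList<>();
        output.add(new String[] {"ms#2", "hasLikelihood", "Low"});
        output.add(new String[] {"ms#1", "hasLikelihood", "Medium"});

        // An empty diff would mean the two models contain the same triples.
        for (String line : diff(canonicalize(reference), canonicalize(output))) {
            System.out.println(line);
        }
    }
}
```

Because the canonical form is independent of the order in which triples were written, a test could assert that the diff against the reference model is empty, regardless of which caching and sync() combination produced the output.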

These tests check that the JenaQuerierDB initialisation methods work as expected, and that there is no interference between those methods and the cache synchronization method.

Spyderisk / system-modeller

Unit tests for JenaQuerierDB are needed #141
