JamesPHoughton / pysd-tools


Notions of Test Coverage #5

Open JamesPHoughton opened 8 years ago

JamesPHoughton commented 8 years ago

In software development, test coverage is an imperfect but useful metric for assessing how well a test suite does its job of building confidence in a code base. Coverage is usually taken as the fraction of the lines of code that the test suite exercises. There is of course more to a test suite than executing every line, but if coverage is low you at least know that the test suite is underdeveloped.

In SD models, there is a lot of room for improvement in how we use behavioral tests to build confidence in our models. It would be useful to develop a concept analogous to coverage for dynamic models. Because a typical model simulation runs all of the 'lines of code', we need a different type of metric.

Candidates:

| | Value | Infection | Recovery |
|---|---|---|---|
| Susceptible | 0 | 0 | - |
| Total Population | 0 | 0 | 0 |
| Infected | 0 | 0 | 0 |
| Contact Rate | 0 | 0 | - |
| Recovery Time | inf | - | 0 |
| Infectivity | 0 | 0 | - |

In this case, if we had a clear way to define what the rows and columns should be by evaluating the model, we could measure the fraction of all cells that are filled, either as a fraction of the whole table or as a fraction of the cells that are not intentionally excluded as irrelevant (marked with a dash, -).
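As a rough illustration of the two fractions, here is a minimal sketch that hand-codes the table above as a dict. The names `coverage`, `EXCLUDED` and `UNFILLED` are hypothetical and not part of pysd-tools. It also assumes that a "filled" cell is one holding an expected value, that a dash marks an intentionally excluded cell, and that `None` stands for a cell nobody has written a test for yet.

```python
import math

EXCLUDED = "-"   # assumption: cell intentionally marked as irrelevant
UNFILLED = None  # assumption: cell for which no test has been written yet


def coverage(matrix):
    """Return (fraction of all cells filled, fraction of non-excluded cells filled)."""
    cells = [cell for row in matrix.values() for cell in row]
    # A cell counts as filled if it holds an expected value (not a dash, not empty).
    filled = sum(1 for c in cells if c is not UNFILLED and c != EXCLUDED)
    # Relevant cells are all cells that were not intentionally excluded.
    relevant = sum(1 for c in cells if c != EXCLUDED)
    total_fraction = filled / len(cells) if cells else 0.0
    relevant_fraction = filled / relevant if relevant else 0.0
    return total_fraction, relevant_fraction


# Columns: Value, Infection, Recovery (copied from the table above).
matrix = {
    "Susceptible":      [0, 0, EXCLUDED],
    "Total Population": [0, 0, 0],
    "Infected":         [0, 0, 0],
    "Contact Rate":     [0, 0, EXCLUDED],
    "Recovery Time":    [math.inf, EXCLUDED, 0],
    "Infectivity":      [0, 0, EXCLUDED],
}

total_fraction, relevant_fraction = coverage(matrix)
print(f"filled / all cells:      {total_fraction:.2f}")
print(f"filled / relevant cells: {relevant_fraction:.2f}")
```

With the table as written, every cell that is not a dash is filled, so the second fraction is 1 while the first is 14/18. The open question is still how to generate the rows and columns from the model itself rather than by hand.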

JamesPHoughton commented 8 years ago

We should probably also distinguish between two kinds of tests. Tests of the structural (e.g. unit consistency) and dynamic (e.g. behavioral mode replication, conservation conditions, etc.) consistency of a model serve to ensure that the represented theory is internally coherent. Tests of the theory's correspondence to empirical observation are a different matter, and may form a separate metric.

JamesPHoughton commented 8 years ago

Calibration as testing: http://www.sciencedirect.com/science/article/pii/S0377221702006227