CQL Unit Testing - Githubissues

JPercival commented 2 years ago

EDIT: Updated based on discussions below

The language server should support CQL unit tests. The first step is to define what a unit test looks like in CQL. We propose adding support for some @tags to support specifying unit tests and their input requirements in CQL

Tag	Value	Description
`@test`	N/A	If specified on a Library, marks it as a test suite. If specified on a definition, marks it as a test
`@parameter`	Name CQLValue	set input parameters for Libraries
`@asof`	CQLDatetime	evaluate as of a specific date
`@context`	Context=Value	the value of the context
`@data`	Path	source directory for test data
`@terminology`	Path	source directory for test terminology
`@mock`	Definition=Value	(future work) specifies a mock value for a CQL definition
`@parameterized`	`<tag>`	(future work) specifies that a test should be repeated with a different set of inputs
`@dataprovider`	ExpressionReference	(future work) supplies a set of data used to run tests
`@ignore`	N/A	(future work) Report the results of this test, but don't fail the overall suite if this fails

An example of a test Library using these tags:

// @test
// @parameter: "Measurement Interval" [@2019,@2020]
// @asof: @2020-10-01
// @context: Patient=654
// @terminology: tests/vocab
library DQMTest

include DQM as targetLibrary
include org.opencds.cqf.cql.Asserts version '1.0.0'
include TestHelpers

// @test
// @data: tests/data
// @context: Patient=123
define "In Initial Population":
  assert(targetLibrary."Initial Population").isTrue()

// @test
// @data: tests/data
// @context: Patient=123
define "Has Required Attributes":
   TestHelpers.HasRequiredAttributes(Patient)

Evaluating a definition marked with @test and getting a Message with the severity level of Error is a test failure.

http://cql.hl7.org/09-b-cqlreference.html#message

@test is required to use the other tags. If any are present without @test, it's an error condition. IOW, these tags are only allowed on unit test libraries.

Any @tag defined at the Library level sets the default @tag value for the set of tests within the Library. @tag values on a definition override the default Library @tag.

Each @test definition is evaluated independently and with the appropriate input parameters as defined by the merged definition and library @tag values. It's up to the test runtime environment to optimize data or context caching to speed up test execution.

If during the evaluation of a test, the CQL definition references any other CQL definitions that are also marked as @test (or with any other of the proposed tags), those @tags are ignored. Only the @tags of the entry point apply. That is to say, during the evaluation of a test the context, input parameters, terminology, or data may not change mid-evaluation.

CQL Test libraries SHOULD NOT be shipped as part of an executable package, such as a FHIR npm package.

The example org.opencds.cqf.cql.Asserts library does not exist at the time of this writing. The contents would be a set of helper functions to assist in specifying expected results for tests.

JPercival commented 2 years ago

A few more questions/considerations:

CQL supports multiple context definitions within a single Library. Should this be represented as repeated @context annotations? Should it be a list of values instead? How does that play with the proposed @parameterized tag?

// @test
// @context Patient=1
// @context Encounter=2
define "Whatever": true

CQL supports using multiple data models, though that is not common. Should the @data tag support that?

using FHIR version '4.0.1'
using QDM version '4.6'

// @test
// @data FHIR=path/to/fhir/data
// @data QDM=path/to/qdm/data
define "Foo": "Bar"

Should unit test libraries be bundled and shipped with CQL packages? Typically test libraries are excluded when shipping software. What should the standard directory structure be for separating tests from code?

JPercival commented 2 years ago

Can we do this? @Data evaluates the expression and makes the returning list of objects the data source for the test.

In theory, yes, but it significantly complicates the implementation of the test framework. The reason is that you end up making multiple passes across the CQL to fully resolve all the inputs. It's conceptually the same reason why Java/C# only allow static values for annotations/attributes.

vitorpamplona commented 2 years ago

I think it could be feasible. TestNG has a similar scheme:

public class DataProviderTest
{
    private int param;

    @Factory(dataProvider = "dataMethod")
    public DataProviderTest(int param) {
        this.param = param;
    }

    @DataProvider
    public static Object[][] dataMethod() {
        return new Object[][] { { 0 }, { 1 } };
    }

    @Test
    public void testMethodOne() {
        int opValue = param + 1;
        System.out.println("Test method one output: " + opValue);
    }

    @Test
    public void testMethodTwo() {
        int opValue = param + 2;
        System.out.println("Test method two output: " + opValue);
    }
}

Outputs:

Test method one output: 2
Test method one output: 1
Test method two output: 3
Test method two output: 2

PASSED: testMethodOne
PASSED: testMethodOne
PASSED: testMethodTwo
PASSED: testMethodTwo

vitorpamplona commented 2 years ago

Maybe we borrow the @DataProvider tag?

There can only be 1 data provider, but it can return many datasets.

library VitorTest

include DQM
include org.opencds.cqf.cql.Asserts version '1.0.0' as assert 

define "Patient1": Patient(Identifier: 1, Name: 'Patrick', DOB: @2014-01-01)   
define "Observation1": Observation(Subject: Patient1, ...)
define "Observation2": Observation(Subject: Patient1, ...)
define "Observation3": Observation(Subject: Patient1, ...)

// @DataProvider
define "setup": 
  { 
    { Patient1, Observation1, Observation2 }               // 1st run
    { Patient1, Observation1, Observation3 }               // 2nd run
    { Patient1, Observation1, Observation2, Observation3 } // 3rd run
  }  

// @test
define "AtLeast2Obs":
  assert.GreaterThan(Count(DQM.PatientObs), 1)

JPercival commented 2 years ago

Also, given fluent functions, the following is possible:

library Test

include DQM
include org.opencds.cqf.cql.Asserts version '1.0.0' as assert 

// @test
define "AtLeast2Obs":
  Count(DQM.PatientObs).GreaterThan(2)

More of an observation on style than anything else. I kinda don't like it.

vitorpamplona commented 2 years ago

Also, given fluent functions, the following is possible:

I like fluent, but it should be this:

assert(Count(DQM.PatientObs)).isGreaterThan(2)

cmoesel commented 2 years ago

Or more of a Jest style:

define "AtLeast2Obs":
  expect(Count(DQM.PatientObs)).toBeGreaterThan(2)

cmoesel commented 2 years ago

Ha. @vitorpamplona and I were thinking along the same lines!

rob-reynolds commented 2 years ago

FWIW, I prefer "assert" over "expect".

rob-reynolds commented 2 years ago

And definitely prefer having an "assert" over not having one, fluent or not.

vitorpamplona commented 2 years ago

Looks like @JPercival hit a nerve... :)

JPercival commented 2 years ago

Hmm... Not sure how expect or assert would work as you've described there, given you can't define a function on objects in CQL. Maybe this?

define function assert(value : System.Any):
     Tuple { value : value }

// @fluent
define function isGreaterThan(actual: Tuple, expected: System.Integer):
  if not actual.value > expected then
    Message(Error, "expected {}, found {}")
  else
    true
  end

JPercival commented 2 years ago

My only concern with @dataprovider is that it makes the test runtime more complex. Would you be ok with putting that on the backlog for future enhancements? I can see the utility, but knowing what I know about the cql-engine and so on I think this almost doubles the complexity.

rob-reynolds commented 2 years ago

You can greatly reduce the testing helpers if you do:

assert.True(Count(DQM.PatientObs) > 2)

You can cover most everything with just True, Equals, Null (and their negations, for clarity).

rob-reynolds commented 2 years ago

@ignore

vitorpamplona commented 2 years ago

Would you be ok with putting that on the backlog for future enhancements?

Of course. I am just trying to make it as simple as possible for users.

My only concern with @dataprovider is that it makes the test runtime more complex.

Agree. But I am not sure by how much. It looks like a simple call before the expression but I might be wrong.

JPercival commented 2 years ago

Agreed that True, Null, Equals cover most cases. We also need Equivalent since that's a concept unique to CQL. Pretty easy to add the negations as well, just for the sake of readability for authors.

assert.True(Count(DQM.PatientObs) > 2)

This would actually be (if we used the fluent API for assertions described by Vitor and Chris):

assert(Count(DQM.PatientObs) > 2).isTrue()

rob-reynolds commented 2 years ago

Yeah, I started to add the fluent version but thought that'd muddy it up.

rob-reynolds commented 2 years ago

Not sure how I feel about this, but, at the time, we just unleashed some QA folks to retro add unit tests for CQL... they came up with the following convention:

// @test // @parameter "Measurement Interval" [@2019,@2020] // @asof @2020-10-01 // @context Patient=654 library DQMTest

include DQM as targetLibrary include org.opencds.cqf.cql.Asserts version '1.0.0' as assert

// @test // @source tests/data // @context Patient=123 define "AtLeast2Obs": assert(Count(targetLibrary.PatientObs) > 2).isTrue()

rob-reynolds commented 2 years ago

Can't color so I'm not sure it's clear.

As a tenet, each test library should only test a single library so they liked "aliasing" the "system under test" as "targetLibrary" for consistency, I guess.

rob-reynolds commented 2 years ago

I'm probably going to regret this, but can we kill the @context at the library level?

rob-reynolds commented 2 years ago

I understand why it's there, but it's stinky, right?

rob-reynolds commented 2 years ago

Let's just say no?

JPercival commented 2 years ago

Could you elaborate why it's stinky? I'm not sure I follow.

rob-reynolds commented 2 years ago

Ideally each test has it's own distinct context.

But I get we don't want to be tyrants about that. If you can find reuse and avoid the pitfalls, great. But maybe we make you be explicit about doing that and don't allow you to say "they're all using the same context".

It would also simplify the "local (test) context takes precedence over global (library) context" explanation.

JPercival commented 2 years ago

It would also simplify the "local (test) context takes precedence over global (library) context".

So, all the tags represent inputs, right? They all affect the state of the evaluation. Local terminology, local data, local parameters, etc. In my view, the context is just "one more thing" that affects the state. I don't see how it's materially different than @parameter, but I'm happy to have that discussion.

rob-reynolds commented 2 years ago

I don't have a solid litmus for it. That's why I used "stinky".

In CQL parameters are set at the Library level. So allowing them to be global in the test seems fine. And narrowing... no problem.

But there is a tenet that each test should have it's own state. At least the "context" part of that state. @context restricted to local supports that.

That's all I got.

rob-reynolds commented 2 years ago

Totally fine if that's not enough. :)

rob-reynolds commented 2 years ago

It would also simplify the "local (test) context takes precedence over global (library) context" explanation.

Not true. I see all the tags are already consistent in this regard so... nm.

vitorpamplona commented 2 years ago

Should every test run measure the time spent on the test to report back by the tool?

vitorpamplona commented 2 years ago

CQL supports multiple context definitions within a single Library. Should this be represented as repeated @context annotations? Should it be a list of values instead? How does that play with the proposed @parameterized tag?
// @test
// @context Patient=1
// @context Encounter=2
define "Whatever": true

I would prefer something like @context { { Patient=1, Encounter=2 }, { Patient=2, Encounter=3 } }. The outside list is for multiple runs, the inside is for multiple arguments in the same run.

CQL supports using multiple data models, though that is not common. Should the @data tag support that?
using FHIR version '4.0.1'
using QDM version '4.6'

// @test
// @data FHIR=path/to/fhir/data
// @data QDM=path/to/qdm/data
define "Foo": "Bar"

This one is hard. Following 1, would this be @data { { FHIR=path/to/fhir/data },{ QDM=path/to/qdm/data } }, meaning it runs first with FHIR and then with QDM?

What happens if we have

using FHIR version '4.0.1'
using QDM version '4.6'

// @test
// @data { { FHIR=path/to/fhir/data },{ QDM=path/to/qdm/data } }
// @context { { Patient=1, Encounter=2 }, { Patient=2, Encounter=3 } }
define "Foo": "Bar"

Does this mean 4 runs:

FHIR-Pat1-Enc2
FHIR-Pat2-Enc3
QDM-Pat1-Enc2
QDM-Pat2-Enc3

or 2 runs?

FHIR-Pat1-Enc2
QDM-Pat2-Enc3

Is it common to have same IDs between QDM and FHIR structures?

vitorpamplona commented 2 years ago

Should unit test libraries be bundled and shipped with CQL packages? Typically test libraries are excluded when shipping software. What should the standard directory structure be for separating tests from code?

Unit tests should not be shipped to production systems to reduce the payload (security review, etc). But I do think we should incentive every CQL author to distribute tests for each version of their libraries. It helps to spec out the workable data objects without having to spec them out, facilitating communication between users and library providers.

Thus, having a common directory structure with very easy ways to download and test any library out there is a big win. Maybe this even gets to the beginnings of an interoperability testing website.

rob-reynolds commented 2 years ago

CQL supports using multiple data models, though that is not common. Should the @data tag support that?
using FHIR version '4.0.1'
using QDM version '4.6'

// @test
// @data FHIR=path/to/fhir/data
// @data QDM=path/to/qdm/data
define "Foo": "Bar"

Personally, I think doing this now would be over-engineering and we should wait until we have an actual use-case to drive it. We could have the docs state that, technically, CQL can do this, but we don't have an actual use-case at this time, and please contact us if you have one.

I'm thinking that, because the data between data models is shaped differently, an actual implementation may not even include something like the example. And until we have an real-world example we're just making stuff up.

rob-reynolds commented 2 years ago

Should unit test libraries be bundled and shipped with CQL packages? Typically test libraries are excluded when shipping software. What should the standard directory structure be for separating tests from code?

I agree with everything @vitorpamplona said. And this doesn't, necessarily, conflict with anything he said, but there's a tenet to keep unit test files separate from system-under-test files. I think we should adhere to it.

If we're agreeing that they shouldn't ship with the software, to me, that means outside the IG's input folder.

I'm also suggesting that in the existing structure we currently have, (which is a tests folder that is a sibling of the cql folder), the content of the existing tests folder should be for the IG, not necessarily for the CQL. Which is to say I think they're currently being slightly misused, if you will, because the expectation is that you right-click in the CQL file and the tests run. There's nothing inherently wrong with that other than I don't think it aligns with what the intent of those should be which is to be run against the entire IG, not just the CQL. And as an aside, I think those should ship with the IG. Although I also think the current situation lacks a proper "executable package" of the IG that would not include those tests. Different topic though.

That said, sibling to input, test/input/cql/test<blah>.cql.

vitorpamplona commented 2 years ago

Are CQL developers bound by IGs? I was imagining the "publication" being discussed here as a standalone CQL lib, similar to FHIRHelpers or the org.opencds.cqf.cql.Asserts itself. CQL libraries could be made available independently of any FHIR context. I am not sure what are the guidelines for those.

But we should find a place for them inside the IGs as well.

rob-reynolds commented 2 years ago

Great point. My experience is by far the vast majority of CQL is in an IG, but I definitely didn't mean to preclude CQL outside an IG.

Please apply non IG counterparts to what I posted, as applicable (I think most of the points still stand).

JPercival commented 2 years ago

Are CQL developers bound by IGs?

Please apply non IG counterparts to what I posted, as applicable (I think most of the points still stand).

There's currently no well-defined structure for CQL "projects" independent of IGs. We could / should come up with a structure of what we think that looks like. Ideally, compatible with the current IG structure.

JPercival commented 2 years ago

This one is hard. Following 1, would this be @data { { FHIR=path/to/fhir/data },{ QDM=path/to/qdm/data } }, meaning it runs first with FHIR and then with QDM?

Not quite. CQL allows the usage of multiple Models at the same time. The following is valid:

library Test

using FHIR version  '4.0.1'
using QDM version '4.6'

define "QDM Patients":
      [ QDM.Patient ]

define "FHIR Patients":
    [FHIR.Patient]

As Rob points out, this is very uncommon.

JPercival commented 2 years ago

Also, there's a user called @daTa that we've been pinging, apparently. Sorry DaTa! You can use the backticks to avoid that. :)

JeffBrown-lantana commented 1 year ago

Copied the link to this Github discussion over to Zulip stream: https://chat.fhir.org/#narrow/stream/338083-Da-Vinci-CQL-Testing

cqframework / cql-language-server

CQL Unit Testing #40