Open EvanMachusak opened 1 year ago
I think it's important to call out here that code in List<Code>
is a different operator than code in ValueSet
, and that there's a really important distinction there. I wonder if it would make more sense to always use the ValueSet
overload, and to support a way to convert a List<Code>
to a ValueSet
, so something like:
define fluent function toValueSet(List<Code>): ValueSet
Then you can 1) always use the optimization of lazy expansion, and 2) ensure that you're always using terminology semantics for the in
operator.
Having said that, there is, by design, no way to construct a ValueSet with the codes that it contains. So let me ask the question, why not have first-class value set definitions for all the lists of codes you want to support?
Because we can implicitly convert a ValueSetRef
to a List<Code>
, it makes List<Code>
the more attractive choice for all APIs because we can either do:
define function Foo(codes List<Code>): ...`
define Bar: Foo("My ValueSet")
define Baz: Foo({ "DRC1", "DRC2" })
The issue I raised here is related to the fact that ExpandValueSet
is an expensive operation. Because there's an implicit conversion, the programmer isn't aware it's happening.
In our SDK, we implemented an implicit conversion between ValueSet
to List<Code>
in such a way that we preserve the fact that the source object is still a ValueSet, e.g. you can think of it like our ValueSet representation implements IList<Code>
so we can use it anywhere that a CQL List<Code>
(represented in .NET as IList<Code>
) is expected. We actually do this with composition and a facade, but it amounts to the same result.
This lets us implement the In
operator for CQL's List<Code>
more efficiently if that List<Code>
is actually a ValueSet.
We currently don't implement ExpandValueSet
the way I think it's intended to be (e.g., you reach out to the terminology server and ask for an expansion). We don't support terminology servers in general at the moment, because HEDIS measures would never allow you to expand your own value sets. We provide the explicit expansion that you must use, and you are not allowed to use any other codes.
Consider this CQL:
When this function is called with a value set, e.g.:
The operand in the ELM to the "code in value set" call is an ExpandValueSet.
The value set does not in fact need to be expanded, because this is valid:
The expression of "example" is an InValueSet.
Eagerly expanding the value set ahead of time introduces a performance penalty for this pattern.
It's possible to declare the parameter as a System.ValueSet which will avoid the ExpandValueSet operand, but because of the automatic expansion of value sets with this usage, these two signatures become ambiguous:
CQL-to-ELM picks the second overload in this case. That's probably correct, but unintuitive.
We use the pattern of passing value sets to functions taking a List of Code often. It's a more flexible signature since it lets us pass either a value set or a manually constructed list of codes.
This is a significant performance penalty when run on the CQF ruler as prematurely expanding value sets which can be large (50k+ rows) results in linear checking of a list of codes instead of an optimized hash lookup of valueset membership.