Dynamic or relative dates - relative to run time

gowthamrao commented 6 years ago

Currently it is possible to specify cohort definitions such as `procedure_start_date on or after '2017-01-01' and procedure_start_date before '2017-03-31'`.

We would like to specify a dynamic date that is relative to run time, e.g. `procedure_start_date on or after (current_date - 180 days) and procedure_start_date before (current_date - 30 days)`.
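To illustrate the arithmetic being requested, the two window bounds could be derived from the run date roughly as follows. This is only a sketch: `rolling_window` and its parameters are hypothetical names, not part of Atlas or circe-be.

```python
from datetime import date, timedelta


def rolling_window(run_date, start_days_back=180, end_days_back=30):
    """Return (start, end) as ISO date strings relative to run_date.

    Hypothetical helper: `start` is meant as an inclusive bound
    ("on or after") and `end` as an exclusive bound ("before").
    """
    start = run_date - timedelta(days=start_days_back)
    end = run_date - timedelta(days=end_days_back)
    return start.isoformat(), end.isoformat()


# A fixed run date keeps the example reproducible; a scheduled job
# would pass date.today() instead.
start, end = rolling_window(date(2018, 6, 30))
print(start, end)  # 2018-01-01 2018-05-31
```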

For business intelligence purposes, we routinely track whether a certain event (condition, procedure, measurement, drug exposure) has happened during a rolling window relative to the date of analysis. Each time, we have to update the reference dates in Atlas, because the dates are 'hard coded'. With relative dates, this use case becomes easier.

gowthamrao commented 6 years ago

@chrisknoll is this allowed by OHDSI standards?

chrisknoll commented 6 years ago

Hi @gowthamrao, I'm sorry I haven't replied to this. It's something I've been thinking through, and I haven't really settled on a position. To answer your question: I don't think there's a position on whether something like this should be allowed or not allowed by OHDSI standards.

But here are my thoughts: when I think about a cohort definition, I'd like it to be deterministic, i.e. if you run the same expression on the same data source, it should give you the same cohort every time. This is important for study reproducibility, but also for the sanity of the researcher, whose numbers shouldn't change just because they re-generated results a few months later. What you are describing is non-deterministic. Since it's a conscious decision on the part of the expression author, I think I can live with circe expressions supporting it, even though I worry about the issues it might raise in use (for example, someone not understanding that the date is evaluated at the time of execution, not at the time of authorship).

To your comment about needing to go into Atlas to make a change: the beauty of extracting circe-be into its own library is that you can decide how to construct the expression that gives you the SQL to execute. You can do some custom find-replace on elements of the JSON and then use circe-be to generate the SQL that will build the cohort for you. The only role of Atlas is to produce a cohort 'template' for you, i.e. you go into Atlas, create your expression, and get this for your procedure criteria:

      {
        "ProcedureOccurrence": {
          "OccurrenceStartDate": {
            "Value": "2018-01-02",
            "Op": "lt"
          }
        }
      }

But you can export that JSON file and replace certain elements like so:

      {
        "ProcedureOccurrence": {
          "OccurrenceStartDate": {
            "Value": "@@currentDate",
            "Op": "lt"
          }
        }
      }

And in your own code you'd read the JSON, replace @@currentDate with the value you want, and then get the SQL from circe-be.
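A minimal sketch of that substitution step, assuming the template was exported from Atlas with the `@@currentDate` placeholder already in place. The helper name `fill_placeholders` is made up for illustration, and the call into circe-be is deliberately left out because it depends on how you invoke the library in your own environment.

```python
import json
from datetime import date


def fill_placeholders(node, values):
    """Recursively replace placeholder strings in parsed JSON.

    Hypothetical helper: only string values that exactly match a key
    in `values` are replaced, so other fields stay untouched.
    """
    if isinstance(node, dict):
        return {k: fill_placeholders(v, values) for k, v in node.items()}
    if isinstance(node, list):
        return [fill_placeholders(v, values) for v in node]
    if isinstance(node, str) and node in values:
        return values[node]
    return node


# Stand-in for the JSON file exported from Atlas.
template = """
{
  "ProcedureOccurrence": {
    "OccurrenceStartDate": {
      "Value": "@@currentDate",
      "Op": "lt"
    }
  }
}
"""

run_date = date(2018, 1, 2)  # in a scheduled job this would be date.today()
expression = fill_placeholders(
    json.loads(template), {"@@currentDate": run_date.isoformat()}
)
expression_json = json.dumps(expression, indent=2)
print(expression_json)
```

Working on the parsed JSON rather than doing a raw text replace avoids accidentally touching a name or description that happens to contain the placeholder. The resulting `expression_json` is what you would then hand to circe-be to generate the SQL.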

I think this might sound like I'm pushing all the work back onto you, and I'm happy to work on changes that introduce a new 'current time offset' criterion that might be useful for you. But I am also trying to promote the idea that circe-be can be used independently of Atlas. You can have your own business processes use circe-be to identify the population and do some custom work with the results of the query. This way, you don't need to wait for Atlas to catch up with your needs; you can build your own business processes as you need them. I'm guessing that in order to support your business process today, you'd have to take the results of the prior execution from Atlas (the cohort definition has a fixed ID, so before you can run a second time you need to copy the results of the prior run, which the next Atlas generation will overwrite). Or perhaps you just generate in Atlas, summarize statistics, and repeat as needed. Either way, something must be capturing the independent invocations of your process. I'm suggesting that you could wrap that entire process into your own workflow, independent of Atlas, and this might be more flexible for your needs.

OHDSI / circe-be

Dynamic or relative dates - relative to run time #17