FHIR / sql-on-fhir-v2

This project provides the source for the SQL on FHIR v2.0 Implementation Guide
https://build.fhir.org/ig/FHIR/sql-on-fhir-v2/
MIT License
92 stars 26 forks source link

Add test for column ordering #228

Closed johngrimes closed 2 months ago

johngrimes commented 4 months ago

This change adds a new test that verifies the column ordering logic in the specification, for implementations that support ordering of columns in the result.

It's basically just the example in the specification in test form:

{
  "title": "column ordering",
  "view": {
    "resource": "Patient",
    "select": [
      {
        "column": [
          {
            "path": "'A'",
            "name": "a",
            "type": "string"
          },
          {
            "path": "'B'",
            "name": "b",
            "type": "string"
          }
        ],
        "select": [
          {
            "forEach": "name",
            "column": [
              {
                "path": "'C'",
                "name": "c",
                "type": "string"
              },
              {
                "path": "'D'",
                "name": "d",
                "type": "string"
              }
            ]
          }
        ],
        "unionAll": [
          {
            "column": [
              {
                "path": "'E1'",
                "name": "e",
                "type": "string"
              },
              {
                "path": "'F1'",
                "name": "f",
                "type": "string"
              }
            ]
          },
          {
            "column": [
              {
                "path": "'E2'",
                "name": "e",
                "type": "string"
              },
              {
                "path": "'F2'",
                "name": "f",
                "type": "string"
              }
            ]
          }
        ]
      },
      {
        "column": [
          {
            "path": "'G'",
            "name": "g",
            "type": "string"
          },
          {
            "path": "'H'",
            "name": "h",
            "type": "string"
          }
        ]
      }
    ]
  },
  "expectColumns": [
    "a",
    "b",
    "c",
    "d",
    "e",
    "f",
    "g",
    "h"
  ],
  "expect": [
    {
      "a": "A",
      "b": "B",
      "c": "C",
      "d": "D",
      "e": "E1",
      "f": "F1",
      "g": "G",
      "h": "H"
    },
    {
      "a": "A",
      "b": "B",
      "c": "C",
      "d": "D",
      "e": "E2",
      "f": "F2",
      "g": "G",
      "h": "H"
    },
    {
      "a": "A",
      "b": "B",
      "c": "C",
      "d": "D",
      "e": "E1",
      "f": "F1",
      "g": "G",
      "h": "H"
    },
    {
      "a": "A",
      "b": "B",
      "c": "C",
      "d": "D",
      "e": "E2",
      "f": "F2",
      "g": "G",
      "h": "H"
    }
  ]
}

Resolves #225.

johngrimes commented 4 months ago

This test fails for the reference implementation for a reason that I don't yet understand:

error: expect(received).toEqual(expected)

  [
    {
-     a: "A",
-     b: "B",
      c: "C",
      d: "D",
      e: "E1",
      f: "F1",
      g: "G",
      h: "H",
    },
    {
-     a: "A",
-     b: "B",
      c: "C",
      d: "D",
      e: "E2",
      f: "F2",
      g: "G",
      h: "H",
    },
    {
-     a: "A",
-     b: "B",
      c: "C",
      d: "D",
      e: "E1",
      f: "F1",
      g: "G",
      h: "H",
    },
    {
-     a: "A",
-     b: "B",
      c: "C",
      d: "D",
      e: "E2",
      f: "F2",
      g: "G",
      h: "H",
    }
  ]

- Expected  - 8
+ Received  + 0
johngrimes commented 4 months ago

Still to do:

awalley-ncqa commented 4 months ago

Hi John, This may be a similar issue (maybe not): I was finding discrepancies in some unit tests (for column ordering) in the reference engine.

I said a little bit about it here.

Basically, I took some a few views from unit tests and placed them in the sof playground and received different column ordering results than the unit tests. Examples can be seen with union.json -> "basic" and union.json ->"unionAll + column".

I am not sure if this will help or be different, but this is how I am currently doing column ordering:

I recursively add all columns to a blank data table (just to store the order), placing column first, select second, and unionAll last. Not sure what your thoughts are/if this is correct.

EDIT: I ran this proposed unit test and got the correct ordering with the column ordering code below

 DataTable
 GetViewOrder(Arena *arena, View *view)
 {
  switch (view->type)
  {
   case ViewType::Union:
   case ViewType::Select:
   {
    for (View* v = view->column_first; v; v = v->next)
    {
     DataTable v_list = GetViewOrder(arena, v);
     res.AddAllColumnsWithoutValues(arena, v_list);
    }

    for (View* v = view->select_first; v; v = v->next)
    {
     DataTable v_list = GetViewOrder(arena, v);
     res.AddAllColumnsWithoutValues(arena, v_list);
    }

    for (View* v = view->union_first; v; v = v->next)
    {
     DataTable v_list = GetViewOrder(arena, v);
     res.AddAllColumnsWithoutValues(arena, v_list);
    }
   } break;
   case ViewType::Column:
   {
    DataColumn col = {};
    col.name = view->name.str8;
    col.value_type = view->data_type;
    res.AddColumnWithoutValues(arena, col);
   } break;
  }

  return res;
 }

 DataTable
 GetColumnOrder(Arena *arena, native_fhir::ViewDefinition vd)
 {
  DataTable res = {};
  for (View *view = vd.first; view; view = view->next)
  {
   DataTable view_order = GetViewOrder(arena, view);
   res.AddAllColumnsWithoutValues(arena, view_order);
  }

  return res;
 }
johngrimes commented 4 months ago

Hi @awalley-ncqa,

I might need a bit more time to digest your code example, but I just wrote you a response in Zulip explaining the semantics of the tests.

https://chat.fhir.org/#narrow/stream/179219-analytics-on-FHIR/topic/Column.20Ordering/near/434276053

awalley-ncqa commented 4 months ago

Thanks john for your response on Zulip, that should fix the issues I was facing! As for the code I posted above, hopefully this is more readable and doesn't contain extra code-base artifacts. Pseudo-Code:


Column[] GetColumns(elem)
{
  Column[] result = [];

  if(elem is "column")
  {
    result.add(elem)
  }
  else if(elem is "select" or elem is "unionAll")
  {
    foreach  _c_ in elem.column: 
        children = AddElement(_c_)
        result.AddAll(children)

    foreach _s_ in elem.select: 
       children = AddElement(_s_)
       result.AddAll(children)

    foreach _u_ in elem.unionAll:
        children = AddElement(_u_).RemoveDuplicates()
        result.AddAll(children)
   }

  return result;
}

Column[] GetColumnList(ViewDefinition vd)
{
  Column[] result = [];
  foreach `select` _s_ in vd:
    children = GetColumns(_s_)
    result.AddAll(children)

  return result;
} 
johngrimes commented 2 months ago

I think this PR is ready to go, just needs a review.

johngrimes commented 2 months ago

I think I am going to merge this now, this branch has lived a good life but if it lives any longer we are going to enter merge hell.