FHIR / sql-on-fhir-v2

This project provides the source for the SQL on FHIR v2.0 Implementation Guide
https://build.fhir.org/ig/FHIR/sql-on-fhir-v2/
MIT License
92 stars 26 forks source link

Semantic Clarification for Result of Multiple Selects #227

Closed awalley-ncqa closed 1 month ago

awalley-ncqa commented 4 months ago

Under the multiple selects section of the documentation it says that

"Sibling selects are effectively cross joined, where each row in each select is duplicated for every row in sibling selects."

This leads me to believe that when you have two sibling selects, a data table should be created for each select (from every resource) then the two of those are CROSS JOIN'ed i.e. cartesian product'ed.

This test example is taken from basic -> two selects with columns:

  "resource": "Patient",
  "status": "active",
  "select": [
    {
      "column": [
        {
          "name": "id",
          "path": "id",
          "type": "id"
        }
      ]
    },
    {
      "column": [
        {
          "name": "last_name",
          "path": "name.family.first()",
          "type": "string"
        }
      ]
    }
  ]
}

and results in

id last_name
pt1 F1
pt2 F2
pt3 null

With the wording above, I would expect the result to be

id last_name
pt1 F1
pt1 F2
pt1 null
pt2 F1
pt2 F2
pt2 null
pt3 F1
pt3 F2
pt3 null

I believe that the wording could be re-worked to say something along the lines of "for each resource, the data tables resulting from sibling selects are CROSS JOIN'ed. Then the rows from the cross join are union'ed into the final data table" if the desired result in the test is correct.

arjun commented 4 months ago

NR: probably bug in the FP engine being used. What should be behavior? Consult with John and Brian P.

johngrimes commented 1 month ago

I think this is covered off in the Processing Algorithm section, please re-open if there are any concerns.