EHDEN / CatalogueExport

Exports the data from the OMOP-CDM that is necessary for the EHDEN Database Catalogue
Apache License 2.0
9 stars 6 forks source link

717 row number missing partition by stratum_id #62

Open MarkoClossOPC opened 3 months ago

MarkoClossOPC commented 3 months ago

I believe there is a bug in the 717 analysis query.

In the statsView cte (line 24) the row number is calculate by row_number() over (order by count_value) as rn

statsView (stratum_id, count_value, total, rn) as
(
  select stratum_id, count_value, count_big(*) as total, row_number() over (order by count_value) as rn
  FROM rawData
  group by stratum_id, count_value
)

Shouldn't this be row_number() over (partition by stratum_id order by count_value) as rn as in the 716 query?

Since statsView is evaluated twice in the PriorStats cte the row numbers will not necessarily match with row_number() over (order by count_value) as rn giving incorrect results due to the <=

priorStats (stratum_id, count_value, total, accumulated) as
(
  select s.stratum_id, s.count_value, s.total, sum(p.total) as accumulated
  from statsView s
  join statsView p on s.stratum_id = p.stratum_id and p.rn <= s.rn
  group by s.stratum_id, s.count_value, s.total, s.rn
)