Open sebbacon opened 3 years ago
Notes from call. The opt-out count is in the hundreds of thousands (e.g. 900k?). We should get the data approximately tomorrow.
The list of organisations being used is a static list. They are updating it to be drawn from all NHSE GP practices that are open, plus those marked as closed with a future date.
Also note that there are various business rules which exclude some kinds of patients e.g. temporary patients. We're getting a list of business rules by tomorrow from Nas, for use in the paper.
On the patient counts they've been investigating.
There are four reasons for differences:
What they call "rebulking the data" - 20 billion obs - very big job.
They have a job in Airflow which they trigger manually and which takes a day or so.
The jobs are monitored by ops people (e.g. noticing resources running out or Presto having had a restart). On that day, Presto restarted, and a job was restarted but not gracefully.
They have a theory about the 200k patients:
This is what they do right now
If a practice is marked as closed some time between two time points, it will be automatically excluded (in the highlight).
So we can verify this by counting the practices.
Lots are marked as closed with a future date (i.e. merging).
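A rough sketch of how we might count the practices affected by the closure rule described above. The practice codes, dates and record layout are made up for illustration; the real list of organisations and its field names would come from their data, and whether the window is inclusive at both ends is an assumption.

```python
from datetime import date

# Hypothetical records: (practice_code, close_date), close_date is None if open.
practices = [
    ("A00001", None),
    ("A00002", date(2020, 6, 30)),
    ("A00003", date(2021, 3, 31)),   # closes after the window (e.g. a planned merge)
    ("A00004", date(2019, 12, 31)),  # closed before the window
]

def closed_between(records, start, end):
    """Codes of practices whose close date falls within [start, end] inclusive."""
    return [
        code
        for code, closed in records
        if closed is not None and start <= closed <= end
    ]

def closing_after(records, end):
    """Codes of practices marked as closed with a date later than `end`."""
    return [
        code
        for code, closed in records
        if closed is not None and closed > end
    ]

# Assumed time points; the real ones depend on the two builds being compared.
window_start = date(2020, 1, 1)
window_end = date(2021, 1, 31)

excluded = closed_between(practices, window_start, window_end)
future_closures = closing_after(practices, window_end)

print(len(excluded), "practices excluded by the closure rule")
print(len(future_closures), "practices marked closed with a future date")
```

Comparing these counts with the number of practices actually missing from the study population would tell us whether the closure rule plausibly accounts for the gap.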
Dima has 4 screens
Here is the bit with the static organisations:
Confused by the patient thing. They have a list of fixed ou
The studied population contains about 1,493,041 fewer patients than EMIS reported to NHSD in January 2021, per the following:
(precise numbers here)
Per #2, I've been told that more than 1m patients are expected to be added in the latest build, so this issue is probably going to be resolved.
However, we should do the following:
Potentially do all of these as time-series plots.
We will need to keep historic data from the previous run to do the delta; we could perhaps handle this by just writing to a (series of?) data files that only exist on the server.
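One possible shape for the "series of data files" idea, as a minimal sketch only. The file naming scheme, the `save_snapshot` / `load_snapshots` / `delta` helpers and the counts are all hypothetical; a temporary directory stands in for wherever the files would live on the server.

```python
import json
import tempfile
from pathlib import Path

PREFIX = "patient_counts_"  # assumed naming scheme, one file per run

def save_snapshot(directory, run_label, counts):
    """Write one run's counts to its own JSON file and return the path."""
    path = Path(directory) / f"{PREFIX}{run_label}.json"
    path.write_text(json.dumps(counts, sort_keys=True))
    return path

def load_snapshots(directory):
    """Return [(run_label, counts), ...] ordered by run label."""
    files = sorted(Path(directory).glob(f"{PREFIX}*.json"))
    return [(f.stem[len(PREFIX):], json.loads(f.read_text())) for f in files]

def delta(previous, current):
    """Per-key difference (current minus previous); missing keys count as 0."""
    keys = sorted(set(previous) | set(current))
    return {k: current.get(k, 0) - previous.get(k, 0) for k in keys}

# Illustrative numbers only, not real patient counts.
with tempfile.TemporaryDirectory() as server_dir:
    save_snapshot(server_dir, "2021-01", {"total": 100, "opt_out": 10})
    save_snapshot(server_dir, "2021-02", {"total": 130, "opt_out": 12})
    snapshots = load_snapshots(server_dir)

labels = [label for label, _ in snapshots]
(_, previous), (_, current) = snapshots[-2], snapshots[-1]
change = delta(previous, current)
print(labels, change)
```

Because every run leaves its own file behind, the same directory could feed both the run-to-run delta and the time-series plots. ISO-style run labels keep the files in chronological order when sorted by name.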