rcpch / rcpch-audit-engine

Epilepsy12 Audit Platform
https://e12.rcpch.ac.uk/
GNU Affero General Public License v3.0
5 stars 4 forks source link

Eatyourpeas/issue891 #980

Closed eatyourpeas closed 1 month ago

eatyourpeas commented 1 month ago

Overview

This PR fixes #891 - this was a feature request to use the maps to represent 2 types of data

  1. a scatter plot of each organisation's children
  2. a choropleth plot for the different health regions to show numbers of children in a given cohort

In addition to the scatter plot it was also possible to show distance from lead centre in the hoverlabel for each child, and some aggregated data for all children in that organisation to include average distance, min / max / std distance travelled.

In the process, chart js and leaflet were both deprecated and removed, and the plotly client library was added as a dependency in static.

The London borough maps were removed and new annotate queries were written to aggregate children by sex, age category, index of multiple deprivation and ethnicity. These had been there, but were not filtered for lead centre, so this was fixed. The new age category plot was added.

Code changes

aggregate_by - a new function to add age ranges to plot as a piechart for infants <1y, preschool children 1-5y, primary school children 5-12y, adolescents 12-18y and >18y

filter_cases_for_geography: a new function and file with a query to annotate cases in an organisation with distance to hospital. This required an addition to the Case model to retrieve longitude and latitude against postcode.

map_from_shape: functions to create a choropleth map using the geojson stored in models with styling using Plotly, and functions to annotate Case queries to return a dataframe of numbers of children in a given region

piechart_from_aggregated_data: this took the aggregation queries of ethnicity, index of multiple deprivation, sex and age_range and creates new functions to generate Plotly pie charts, with styling.

scatterplot_from_cases: This generates a Plotly scatterplot and has a function which generates a dataframe for this purpose of aggregated distances to add to the scatterplot footer.

`tiles_for_region: largely unchanged. Some minor refactoring

postcode: a new function to return latitude and longitude for a postcode

seed: adds two new flags to the seed cases function: --skip that allows the user to avoid skipping seeding if cases already exist in the database, and --organisations which accepts a list of ODS codes to replace the default list. Both are optional.

organisation_views: refactor to make calls to the new charting and query functions and pass these in the context to the selected_organisation_summary template

selected_organisation_summary: changes to the template to move pie charts out of their own row, into 2 columns in a single row alongside the scatterplot as a third column in that row. The 3 choropleth charts (2 for the welsh) are in a row below this.

settings.py: add the Mapbox API key: note this will also need to be added to the environment variable file

Remaining changes remove any deprecated JS dependencies, or update css

Documentation changes (done or required as a result of this PR)

No real implications for documentation. A new section on charts needs writing.

Related Issues

closes #891 and closes #883

eatyourpeas commented 1 month ago

Because pictures are sometimes clearer than text.

image

eatyourpeas commented 1 month ago

@dc2007git because this is quite a large new feature I would be grateful if you could review this? Need not be a line by line review but would like to check:

  1. Does it render the same for you?
  2. Does it look sensible if there is no data for a given organisation?
  3. Does it look sensible for Welsh organisations? On reflection, for example, although I have added information icons for the pie charts to explain that these reflect all children under the care of this organisation as lead across the cohorts, we should make it explicit that the choropleth plots reflect case density only in that cohort. So that would involve adding an info icon and a tooltip to explain that. I can do this or you, as part of your review?
dc2007git commented 1 month ago

@eatyourpeas sure thing, happy to take a look, and have no probs adding the tooltip. Any questions I'll drop you a message :)

dc2007git commented 1 month ago

@eatyourpeas the maps very much seem to work for Addenbrooke's hospital, showing the location of the hospital in its ICB, NHSregion and country. The number of cases illustrated by the tooltip also works as intended.

However, the cases map seems to have a bug, where only one child is shown. I have seeded my database, and there are 25 children in cohort 6 registered for addenbrookes, but only 11 appear as registered. Only 11 have completed their first year of care, so this makes sense. However, on the Cases Map for Addenbrookes, only 1 child appears.

An organisation having no data seems to be sensibly shown on both the maps and the pie chart. I'm definitely of the opinion that if there is no data then we should have a completely empty box, as opposed to a pie chart showing 0/0. If we go this way for some of the pie charts, we should probably do the same for the others (I'm thinking completed patient records sitting at the top right box).

If I'm being pedantic, overlaying the NHS regions and ICBs onto the map leaves some geographical data, like city names, slightly obstructed (as opposed to completely, which will look tidier):

image

For the Welsh organisations, I think that in the Local Health Board area, it looks great and intuitive. But the empty white box could potentially be removed, so that on the lowest row there is only local health board and country?

dc2007git commented 1 month ago

It also looks like the cohort is not loading into the already existent tooltips - it should say '.... registered within cohort 6 at...'. Perhaps cohort isn't passed in as context properly?:

image
eatyourpeas commented 1 month ago

This is very helpful feed back thank you - the functions for filtering are in ilter_all_registered_cases_by_active_lead_site_and_cohort_and_level_of_abstraction in filter_by_geography.py. Perhaps a second pair of eyes to find why? I would be grateful if you could have a look and we can do together if no obvious solution.

Yes, I agree about the white space in the Welsh screens.

dc2007git commented 1 month ago

@eatyourpeas will take a look and let you know what I manage to find!

dc2007git commented 1 month ago

@eatyourpeas after a lot of logger.debug() statements, I found that all of the agent smiths that my database is seeded with are given the same postcode, and as such, it looks like the map only shows one point per postcode! I changed a couple of postcodes and it looks like it's working now, but this is a consideration for if more than 1 case has the same postcode, which is more likely in urban areas like London with lots of housing in one area (apartment buildings, etc).

Could I just ask you to change a couple of postcodes for your agent smiths in one organisation too, just to make sure that we're in the clear? Cheers!

eatyourpeas commented 1 month ago

That is fine - it was working for me and that must be the reason. @AmaniKrayemRCPCH has requested we add labels to explain the different tiles to explain what they are showing - so perhaps let's add little info icons to the titles of the choropleths with text to explain these tiles represent numbers of children in cohort 6. If you are happy with that then feel free to merge.

dc2007git commented 1 month ago
image

This is what I've added

dc2007git commented 1 month ago

Closes #983