Unexpected visualization of results

missyschoenbaum commented 6 years ago

Map draws really strangely. Legend is gone.

This is based on #878, but I repeated it several times and got the same results. I closed and reopened. I changed the iteration count and the spread methods.

I acknowledge that it is testing on an unlikely scenario, as I put in a new population and just hooked stuff up with no reason. Will post up files to google drive, but will be tonight.

missyschoenbaum commented 6 years ago

I'm getting the same result on another test. Using the "pie" population again using sample scenario parameters. Will try this without the pie.

BryanHurst commented 6 years ago

@missyschoenbaum I'd like to talk about this one more at the meeting this morning.

BryanHurst commented 6 years ago

We are going to look further into this to see if the data is bad, or if this is just a problem with the presentation.

missyschoenbaum commented 6 years ago

Looking at the data, it seems to have extreme airborne spread. I'm trying to back this off and run again with no airborne to see what happens. Query to look at data was:

    SELECT
        --*
        iteration, day,
        --production_type_id -- not useful, instead use case statement to pull in real name or assign name
        CASE WHEN name IS NULL THEN "ALL" ELSE name END as productiontype,
        last_day,
        -- Transition State Daily Unit
        tsdUSusc, tsdULat, tsdUSubc, tsdUClin, tsdUNImm, tsdUVImm, tsdUDest,
        -- Exposure by UNIT COUNTS
        expcU, -- NAMING EXAMPLE Exposure Cumulative by Unit
        expcUAir, expcUDir, expcUInd,
        expnU, -- NAMING EXAMPLE Exposure New by Unit
        expnUAir, expnUDir, expnUInd,
        adqcU, adqnU,
        -- Infection by Unit
        infcU, infnU, detcU, detnU, descU, desnU, deswU
    FROM Results_dailybyproductiontype r
    Left join -- left join here because the NULL production type indicates ALL, note case statement to manage this
        ScenarioCreator_productiontype pt on r.production_type_id = pt.id
    -- Example of WHERE clause
    WHERE 1=1 AND production_type_id is null
    Order by 1, 2, 3

Subset.xlsx

missyschoenbaum commented 6 years ago

Tinkered around and got a fairly normal set of results that are still drawing blue. Will upload Test896_bluescreen3 tonight.

subset_export.xlsx

missyschoenbaum commented 5 years ago

It also seems like this only happens when I replace a population. I am continuing to explore this.

missyschoenbaum commented 5 years ago

I'm not comfortable with the outputs after all. It is hard to judge when grabbing an example and trying to apply known parameters in a way that was not well researched. I think we should at least look at how the C engine is interacting with the data.

missyschoenbaum commented 5 years ago

Can we let Conrad look at this?

missyschoenbaum commented 5 years ago

I tried running this apples to apples. I took the Sample Scenario, replaced the population with the same exact population. I did not get a blue screen, but I got different results. This should still be running from the set seed, so they should be consistent. BlueScreenApplestoApples.docx

missyschoenbaum commented 5 years ago

I think this answers the question as to the error being in the results vs the visualization. Let me run the parameter report before we take another step to ensure that I didn't mess up a parameter block.

BryanHurst commented 5 years ago

After running line by line through the code with side by side scenarios, I can say that the large blue background is coming from 'zone_blues' on line 133 in 'interactive_graphing.py'.

The code is actually performing correctly, and this is because zone and population data seems to be unlinked in the bad examples.

I believe this is related to #915 in that we have some dangling data/relations that need to be handled when we do a replace population.

ConradSelig commented 5 years ago

zone_blues is just a color palette for when the zones are drawn - it doesn't decide how the zones are being drawn, which is important because it's obviously being drawn incorrectly (maybe).

Two variables determine the size of a zone, the first being largest_zone_radius. This variable is determined by the zones created by the user. For example, in the sample scenario the largest_zone_radius is 15.0. The second variable is kilometers_in_one_latitude_degree, which is hard-coded at 111.13. A zone's radius (when being drawn) is equal to largest_zone_radius / kilometers_in_one_latitude_degree, or 15.0 / 111.13, or 0.1349~.
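The radius calculation described above can be sketched roughly as follows. This is a minimal illustration, not the actual code in interactive_graphing.py: the function name drawn_zone_radius is hypothetical, and only the two variable names and the 111.13 constant come from the description.

```python
# Hard-coded constant, as described in the comment above.
KILOMETERS_IN_ONE_LATITUDE_DEGREE = 111.13


def drawn_zone_radius(largest_zone_radius):
    """Convert a zone radius in kilometers to degrees of latitude.

    Hypothetical helper: the name and signature are assumptions made
    for illustration only.
    """
    return largest_zone_radius / KILOMETERS_IN_ONE_LATITUDE_DEGREE


# Sample scenario: largest_zone_radius is 15.0
radius = drawn_zone_radius(15.0)
print(radius)
```

Note that the result depends only on the user-defined zone size, not on the extent of the population being drawn.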

This works great for scenarios like the sample scenario, where scaling seems to be correct: on the population screen, both latitude and longitude range from 32 to 38. On scenarios like the pie population, on the other hand, things are different. Latitudes range from 40.45 to 40.65 (.2 difference!) and longitudes range from -0.1 to -0.25 (.15 difference!). Get this, however: the zone radius is the SAME in Test896_bluescreen3 - 0.1349~. This means a single zone spans almost the entire map!
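To put numbers on that comparison, here is a rough sketch that expresses one zone's diameter as a fraction of the map extent along an axis. The helper zone_span_fraction is hypothetical and not part of ADSM; the coordinate ranges are the ones quoted above.

```python
def zone_span_fraction(radius_deg, min_coord, max_coord):
    """Fraction of one map axis covered by a single zone's diameter.

    Hypothetical helper for illustration; values above 1 mean the zone
    is wider than the whole map along that axis.
    """
    extent = abs(max_coord - min_coord)
    return (2 * radius_deg) / extent


radius = 15.0 / 111.13  # drawn radius in degrees, same for both scenarios

# Sample scenario: both axes range from 32 to 38.
sample = zone_span_fraction(radius, 32, 38)

# Pie population: latitude 40.45 to 40.65, longitude -0.25 to -0.1.
pie_lat = zone_span_fraction(radius, 40.45, 40.65)
pie_lon = zone_span_fraction(radius, -0.25, -0.1)

print(sample, pie_lat, pie_lon)
```

In the sample scenario a zone covers a few percent of the map, while for the pie population the same zone is wider than the map on both axes.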

I think the zones are drawing at the correct size, and the scaling of our population is off. With infected units all around the circle of the pie population of course zones would be drawn all over the entire map, and because the zones are so big they appear to be the background.

Take a look at these "annotated" screenshots - you can spot some area where the zones did not reach the full color in the bottom left (under the legend), and you can see from the curvature of the lines two "zones" and how big they are drawing.

And a screenshot of the scaling on this population, straight from the population screen:

So here is my "summary hypothesis": Zones are drawing correctly and only appear so big because the scaling of the population is not being accounted for when determining the zone radius, meaning populations with very small scales appear to have very large zones being drawn.

A couple options moving forward (as I can see it):

Add some sort of scalar into the radius calculation so zones are drawn at the expected size
Perform some level of population cleaning to ensure populations are at an expected size.

@BryanHurst @missyschoenbaum How do we want to move forward on this?

missyschoenbaum commented 5 years ago

@ConradSelig So, I think the bottom line is that I have made us spend a long time on something that is not a real problem. I was thinking I had this happen with a Texas population also. Do you mind looking in the Google drive, under 896? There is a folder called something like blueresutltsdifferentpop. Could you see if it has the same issue?

ConradSelig commented 5 years ago

blueresultsdifferentpop looks to be having the same problem.

I think it would be possible to scale our zone circles somehow - if you want to put the time into that.

missyschoenbaum commented 5 years ago

Wow, it's real data then. I won't call it by name. Can you give me an estimate? I realize that the estimate may take some of the time in solving it.

ConradSelig commented 5 years ago

I know exactly where to change the size of the zones, the question is how to get the map scale and how to manipulate that to get zones that are a good size.

It's going to be a lot of trial and error, but I imagine it wouldn't take more than a day to complete.

missyschoenbaum commented 5 years ago

It may be hard to judge what the full effect is going to be. Let's try because all blue is really weird.

ConradSelig commented 5 years ago

Well the good news is that a day was a vast overestimation of how much time this would take.

This change will require LOTS of testing to ensure that maps are being produced that look as expected. A few additional notes about this change:

Displayed zone sizes are no longer dependent on user-defined zones, meaning visual zones are an even more generalized reference of how the disease spread.
Scenarios with very small scales still draw unexpectedly but this is due to unit squares being a set constant size (each unit gets a square with 3 bars displaying different "unit outcomes" - see output map key).
Zones are drawn to be 1/50 the width of the total map. This value can be changed by simply editing two scalar values in interactive_graphing.py (graph_zones()). This size was chosen based on a relatively small sample of "regularly" scaled scenarios to most closely match their previous map outputs.
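As a rough sketch of what tying the zone size to the map size could look like (this is not the actual graph_zones() code): the function name scaled_zone_radius and the ZONE_WIDTH_FRACTION constant are hypothetical, and the sketch assumes "width" means the zone's diameter measured against the longitude extent of the population.

```python
# Assumed stand-in for the scalar values mentioned above; the real
# names and values in interactive_graphing.py may differ.
ZONE_WIDTH_FRACTION = 1 / 50


def scaled_zone_radius(longitudes, fraction=ZONE_WIDTH_FRACTION):
    """Return a zone radius so the zone's diameter is `fraction` of the map width.

    Hypothetical helper: assumes map width is the longitude extent of
    the population's units.
    """
    map_width = max(longitudes) - min(longitudes)
    return (map_width * fraction) / 2


# Sample scenario extent (32 to 38) versus pie population extent (-0.25 to -0.1)
print(scaled_zone_radius([32, 38]))
print(scaled_zone_radius([-0.25, -0.1]))
```

With this approach the drawn zone shrinks along with the map, so a small-scale population no longer ends up covered by a single zone.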

BryanHurst commented 5 years ago

Good catch on this one, I glazed over the code in interactive_graphing because I hate figuring out the math. Though the ticket note in the file should have given a clue.

I like the solution you have to tie the zone size to the map size, though agree that we need to test some extremes with this solution in place.
It would be really good to run a simulation on a LARGE real population. @missyschoenbaum will probably need to do that.

missyschoenbaum commented 5 years ago

We may always have some case where zones visualize strangely. I think I will post a known bug regardless of our outcome, and then we can note the comment about how very small scales may act. Really, we cannot predict every possibility, but we can try to manage most.

missyschoenbaum commented 5 years ago

holding to wait on #915

missyschoenbaum commented 5 years ago

I like this a whole lot better, even though it is probably about the same result.

REplacePopwithCircle

missyschoenbaum commented 5 years ago

Let me exercise it against another pop before I close, but I think we are on the right track.

missyschoenbaum commented 5 years ago

2nd one looks great. TexasREplace Calling this good.

NAVADMC / ADSM

Unexpected visualization of results #896