NYCPlanning / labs-factfinder

New York City Census Reporting Tool
Other
40 stars 12 forks source link

single geog: percentage point difference not all always graying #545

Closed EricaMaurer closed 6 years ago

EricaMaurer commented 6 years ago

https://nyc-factfinder.planninglabs.nyc/profile/735/social?charts=false&comparator=SI07&reliability=true image

allthesignals commented 6 years ago

Screen Shot 2018-03-05 at 10.36.09 AM.pngPossibly related to #548

The percentage point estimate and moe columns should be gray (insignificant ) and they'ren ot

allthesignals commented 6 years ago

So this is related to percentage point difference significance, but your PR added code for the significance calculation which is different. Gonna look at the SQL and create a test

EricaMaurer commented 6 years ago

Ok cool. #528 was for change, this issue is difference

allthesignals commented 6 years ago

I think I meant this PR https://github.com/NYCPlanning/labs-factfinder-api/pull/31/files

allthesignals commented 6 years ago

What are the inputs for this significance calculations? Trying to double check formulas.

EricaMaurer commented 6 years ago

percentage point est and MOE

EricaMaurer commented 6 years ago

https://github.com/NYCPlanning/labs-factfinder-api/pull/31/files was for special variables

allthesignals commented 6 years ago

Alright, I have a failing test for this.

The math for this is as follows:

WHEN ABS(
  SQRT(
    POWER(coalesce(percent_m, 0) / 1.645, 2) 
      + POWER(coalesce(comparison_percent_m, 0) / 1.645, 2)
  ) * 1.645 ) > ABS(comparison_percent - percent)
THEN 
  false
ELSE true

I think what's happening is that comparison_person is null, so it can't perform math on an unknown value. That pollutes the entire calculation because it can't do boolean logic on a null value.

This is a tricky thing in some languages - JavaScript is fast and loose with nulls and embraces the "truthiness" concept where null is considered "falsey" and therefore evaluatable in boolean logic.

In our SQL script this seems like a bigger issue of how to deal with nulls and 0s mid-calculation - sometimes we should assume null is 0 for additional calculations mid-stream, but those values shouldn't be null when they are served out to the app.

PR incoming.