Numerical Answers CSV should account for for multiple Q's

jmadans commented 9 years ago

Right now, the /numerical_answers only outputs data for one agree/disagree question. Per @andyhull's issue #191, I've got a hunch it's grouping all the data (all agree counts, all disagree counts) as the responses to one question.

With our pilot Cities starting to close their campaigns and start analysis I'd like to put this issue on the docket for this week's cityvoice o'cock. @daguar, do you think @migurski will be able to figure on his own?

daguar commented 9 years ago

I'd put this at ~2-3 CVOC's, and I'd probably have to help out.

daguar commented 9 years ago

Actually, let me clarify — if it is JUST the CSV, then it may be more like 1-2.

migurski commented 9 years ago

Approximate SQL we’d want:

SELECT c.location_id, a.question_id, a.numerical_response, count(a.id)
  FROM answers AS a
  JOIN calls AS c ON c.id = a.call_id
 WHERE a.numerical_response IN (1, 2)
 GROUP BY c.location_id, a.question_id, a.numerical_response
 ORDER BY c.location_id, a.question_id, a.numerical_response

Approximate display:

 location  question     response  count 
----------------------------------------
                  1        agree     13
                        disagree      1
          ------------------------------
                  2        agree      4
                        disagree     10
    1     ------------------------------
                  3        agree      9
                        disagree      5
          ------------------------------
                  4        agree      9
                        disagree      5
----------------------------------------
                  1        agree      2
          ------------------------------
                  2     disagree      2
    2     ------------------------------
                  3        agree      2
          ------------------------------
                  4        agree      1
                        disagree      1
----------------------------------------
                  1        agree      1
                        disagree      2
          ------------------------------
                  2        agree      2
                        disagree      1
    3     ------------------------------
                  3        agree      2
                        disagree      1
          ------------------------------
                  4        agree      2
                        disagree      1

migurski commented 9 years ago

We could also display question-first, if that’s more appropriate for the audience.

SELECT a.question_id, a.numerical_response, c.location_id, count(a.id)
  FROM answers AS a
  JOIN calls AS c ON c.id = a.call_id
 WHERE a.numerical_response IN (1, 2)
 GROUP BY a.question_id, a.numerical_response, c.location_id
 ORDER BY a.question_id, a.numerical_response, c.location_id

 question    response  location  count 
---------------------------------------
                              1     13
                agree         2      2
                              3      1
    1     -----------------------------
             disagree         1      1
                              3      2
---------------------------------------
                agree         1      4
                              3      2
    2     -----------------------------
                              1     10
             disagree         2      2
                              3      1
---------------------------------------
                              1      9
                agree         2      2
                              3      2
    3     -----------------------------
             disagree         1      5
                              3      1
---------------------------------------
                              1      9
                agree         2      1
                              3      2
    4     -----------------------------
                              1      5
             disagree         2      1
                              3      1

daguar commented 9 years ago

I'd make the argument that this page should show aggregate results across all locations (rather than splitting it out by location).

An illustrative example is something like this (though we'd have the same kind of %age bars as exist now):

My reasoning goes something like this:

The location pages already allow you to compare responses across locations
Some of the questions I've seen are not location-specific, and having a topline number is almost certainly really useful (for example, I have attended a evening program in the parks. and We need more evening programs in our parks. from Canton both seem like questions the city will want aggregate numbers on)
If we break down the numbers by location on this page, there is actually NO page to see aggregate numbers (so folks would have to manually add them, unless we did both aggregate AND location cross-tab on this page)

I think this is a question that'll bear itself out over more contexts and more city usage, but my thinking is getting aggregate numbers is the highest value marginal step here. @jmadans would love your thoughts based on the existing pilot city questions/experience.

daguar commented 9 years ago

I've got a crappy-looking (what am I, a designer?) implementation of the Activity page fix that looks like this:

codeforamerica / cityvoice

Numerical Answers CSV should account for for multiple Q's #230