Rare-Technology / HHS_Dashboard

Socioeconomic survey dashboard
https://portal.rare.org/en/tools-and-data/household-survey-data/
MIT License

Add new plots from questions that are not mapped yet #12

Open abelvaldivia opened 3 years ago

abelvaldivia commented 3 years ago

@zross There are at least 17 HHS questions that are still not in the app and should be included: Q7, Q15, Q20, Q21, Q30e-Q30i, Q35, Q48, Q49, Q50, Q57, Q58, Q62, Q69, Q71, Q74, Q75, Q76. In the hhs_questions file, these questions are listed under the column "question_no_included". To add a question to the select menu in the UI, move it from the "question_no_included" column to the "question" column. For each question, code needs to be written to produce the summary plot and summary table.

zross commented 3 years ago

@Court78 a couple of questions for you:

  1. How do I get the data for these other questions? If you look at this script you can see that I've set up a separate file with the URLs for the data. Can someone provide these for me? Or perhaps you want to give me data.world access?

  2. Are all of these questions multi-answer? In that script file with the URLs you can see that the multi-answer questions are separated and have their own part of the list.

Court78 commented 3 years ago

@zross George is giving you access to data.world now. I think we should jump on a call to go through it

Court78 commented 3 years ago

@zross HHS data is here

https://data.world/rare/social-science-data

Court78 commented 3 years ago

@zross Sorry one more thing. We are having someone consolidate the datasets that are currently in data.world. Let's chat briefly

zross commented 3 years ago

OK

Court78 commented 3 years ago

Hi Zev,

Attaching the HHS doc with the relevant questions highlighted

HHS Instrument_V2_62719.docx

Court78 commented 3 years ago

@zross Below are the dataset sources for missing questions

Q7 - hh_people - multi answer, table, mean per household (need to hold off on this one; sorting out a data issue)
Q15 - hh_activities - multi answer, bar, mean
Q20 - hh_surveys - multi answer, bar, % households (show each answer choice)
Q21 - hh_surveys - single answer, bar, % households (show each answer choice)
Q30 - hh_surveys - multi answer, table, % households (show each answer choice)
Q35 - hh_surveys - multi answer, table, % households (show each answer choice)
Q48 - hh_enforcement - multi answer, bar, % households (stacked bar for yes - yes male/yes female)
Q49 - hh_surveys - single answer, table, % households (show each answer choice + No answer (blank))
Q50 - hh_surveys - single answer, table, % households (show each answer choice)
Q57 - hh_surveys - single answer, table, % households (show each answer choice)
Q58 - hh_surveys - single answer, bar, % households (show each answer choice)
Q62 - hh_surveys - single answer, table, % households (show each answer choice)
Q69 - hh_customers - multi answer, table, % households (show each answer choice) and mean unit price
Q71 - hh_surveys - multi answer, table, mean
Q74 - hh_surveys - single answer, table, mean
Q75 - hh_surveys - single answer, table, mean
Q76 - hh_surveys - single answer, table, mean

hh_activities, Q15 = https://query.data.world/s/vh35m4iydwb6ii7olxeiydgj63zgjl
hh_enforcement, Q48 = https://query.data.world/s/6bowin4lfthqltrmvjmhi5ern4uqxe
hh_people, Q7 = https://query.data.world/s/jwlx4blr6tn76asurbxcp7ymgriwtq
hh_customers, Q69 = https://query.data.world/s/oltaab47rb5ixiwlecxemhrwunbnej
hh_leadership, Q45 = https://query.data.world/s/w4o2im3qbp7mmv4oajax2q5iowmxeg
hh_meetings, Q44 = https://query.data.world/s/5kdowi3cmdyd4mayrdgdvzh6eqq4cj
hh_responsibilities, Q14 = https://query.data.world/s/7irm5jn2zjyza2anjfpinzhzxhnlha
hh_surveys, all other Qs = https://query.data.world/s/sua67ju6acax2o3zyw4de4wbouqyhv
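For the "% households" entries on multi-answer questions, a minimal sketch of the calculation might look like the following. The data frame, its column names (`submissionid`, `answer`), and the helper `pct_households` are all hypothetical stand-ins for illustration, not the dashboard's actual code or the real dataset layout.

```r
# Hypothetical long-format responses: one row per (household, selected answer).
answers <- data.frame(
  submissionid = c(1, 1, 2, 3, 3, 3),
  answer = c("Net", "Line", "Net", "Net", "Line", "Trap"),
  stringsAsFactors = FALSE
)

# Percent of households that selected each answer choice.
pct_households <- function(df) {
  n_households <- length(unique(df$submissionid))
  # Count each household at most once per answer choice.
  df <- unique(df[, c("submissionid", "answer")])
  counts <- table(df$answer)
  out <- data.frame(
    answer = names(counts),
    pct = as.numeric(counts) / n_households * 100,
    stringsAsFactors = FALSE
  )
  # Most frequently selected answers first.
  out[order(-out$pct), ]
}

q20_summary <- pct_households(answers)
print(q20_summary)
```

Because a household can pick several answers, the percentages for a multi-answer question can add up to more than 100.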

zross commented 3 years ago

@Court78 we agreed to the following steps:

  1. You will edit the comment directly above and add (a) whether each question is multi answer or single answer and (b) where it's easy for you to say, what kind of plot to use (bar, stacked bar, faceted bar) and what values to compute (proportion? average? etc.)
  2. I will pick two different plot types and code them up, and then you can decide if you want me to finish the others or if you will do the coding.

zross commented 3 years ago

@Court78 I know you're busy :) I think that doing this one issue would allow me to start moving forward with adding questions.

Court78 commented 3 years ago

@zross OK. I updated the table above. Hopefully, it makes sense :) I went with tables on most of these because I thought faceted bars would be confusing when looking at multiple MA areas

zross commented 3 years ago

Note to myself: In the link to data.world, click on the Launch Workspace button

zross commented 3 years ago

@Court78 the URLs for reading from data.world that I got from Abel are no longer working -- I get a 404 error.

I can create new URLs by:

  1. Clicking on the table
  2. Clicking on download
  3. Choosing embed

But the ones from Abel no longer work. Do you know why?

I need to be able to fill in this file: https://github.com/Rare-Technology/HHS_Dashboard/blob/zr/main/data-raw/urls-data.R

Court78 commented 3 years ago

George had to recreate these queries in data.world so the IDs changed. I can update!

zross commented 3 years ago

Depending on how much you want to do yourself versus leave to me, you can add the URLs for the other questions. That would save real time for me :)

zross commented 3 years ago

Note that file is on the branch zr/main

Court78 commented 3 years ago

Yep! I can add URLs for the other questions

zross commented 3 years ago

@Court78 in email you asked:

I’m updating the data.world links and am a bit confused that some are queried by country and others aren’t. Do you need them queried by country? Would it make more sense to provide a link to the full dataset and then just reference that for any questions that depend on it? Looks like they are joined by submission ID.

I was confused by this; I don't know why it was done this way. But since it was provided to me both ways, I have code to handle both ways. When it's by country I just stack them into a single file.

Easiest for me is a single URL for all countries but providing it country-by-country is fine also.
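Roughly, handling both layouts could look like the sketch below. This is only an illustration under assumptions: the per-country data frames, their columns, and the helper `stack_countries` are hypothetical, and in practice each piece would be read from a data.world query URL rather than built inline.

```r
# Hypothetical per-country extracts; real ones would come from
# read.csv(<data.world query URL>) for each country.
country_data <- list(
  IDN = data.frame(submissionid = c(1, 2), country = "IDN",
                   q21 = c("Yes", "No"), stringsAsFactors = FALSE),
  PHL = data.frame(submissionid = 3, country = "PHL",
                   q21 = "Yes", stringsAsFactors = FALSE)
)

# Accept either one data frame covering all countries or a list of
# per-country data frames, and always return a single data frame.
stack_countries <- function(x) {
  if (is.data.frame(x)) {
    return(x)
  }
  out <- do.call(rbind, x)  # requires identical column names in each piece
  rownames(out) <- NULL
  out
}

hhs <- stack_countries(country_data)
print(hhs)
```

Stacking with `rbind` only works if every country extract has the same columns, which is one reason a single all-country URL is simpler.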

Court78 commented 3 years ago

@zross I added data.world links to the table above with the Qs. Let me know if you run into any issues

zross commented 3 years ago

Notes to myself on progress:

  1. On branch zr/git12-newQ
  2. Working on question 20 and creating a function for multi-variable situations. The function is prep_data_for_plot_multivar in utils-data.R, and the attempt is in plot_q20_gear.

zross commented 3 years ago

Note to myself on steps needed for new questions:

Assuming they are in the dataset:

  1. Change hhs_questions so that the question is moved from question_no_included to question. This is done in process-hhs-questions.R

  2. Add the name of the plot function (e.g., plot_q20_gear) to the list of plot functions in constants.R

  3. Add the corresponding .R file (e.g., plot_q20_gear.R).

  4. Add the plotting code to that file
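Steps 1 and 2 above can be sketched as follows. This is a toy illustration, not the contents of process-hhs-questions.R or constants.R: the small `hhs_questions` table, the helper `include_question`, the `plot_funs` vector, and the `plot_q8_religion` entry are all hypothetical; only the column names and `plot_q20_gear` come from this thread.

```r
# Hypothetical stand-in for the hhs_questions table.
hhs_questions <- data.frame(
  question = c("q8", NA),
  question_no_included = c(NA, "q20"),
  stringsAsFactors = FALSE
)

# Step 1: move a question from question_no_included to question.
include_question <- function(questions, q) {
  idx <- which(questions$question_no_included == q)  # which() skips NA rows
  questions$question[idx] <- q
  questions$question_no_included[idx] <- NA
  questions
}

hhs_questions <- include_question(hhs_questions, "q20")

# Step 2: register the new plot function name alongside the existing ones.
plot_funs <- c("plot_q8_religion")  # hypothetical existing entry
plot_funs <- c(plot_funs, "plot_q20_gear")

print(hhs_questions)
print(plot_funs)
```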