display mean values for cohort and for higher organizational levels

andrewsu commented 1 year ago

Suppose we have a report generated for a specific lab, say "NEURO LAB 1". For many questions, we will have responses that correspond to a scale, e.g., 'Strongly agree', 'Somewhat agree', 'Neither agree nor disagree', 'Somewhat disagree', 'Strongly Disagree'. Those currently are ordered and visually displayed in a bar chart.

In this issue, I propose also calculating a numeric score from the responses to a given question. We might do this by assigning a score for each answer, e.g.,

'Strongly agree' = +2
'Somewhat agree' = +1
'Neither agree nor disagree' = 0
'Somewhat disagree' = -1
'Strongly Disagree' = -2

The responses could then be averaged, and that average could then be shown on the PDF report.
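A minimal sketch of the scoring idea, assuming the example score values above. The name `mean_score` and the case-insensitive matching are illustrative choices, not part of the existing analysis script.

```python
from statistics import mean

# Example mapping taken from the scores proposed above.
AGREEMENT_SCORES = {
    "Strongly agree": 2,
    "Somewhat agree": 1,
    "Neither agree nor disagree": 0,
    "Somewhat disagree": -1,
    "Strongly disagree": -2,
}


def mean_score(responses, scores=AGREEMENT_SCORES):
    """Average the numeric scores of the responses; None if there are none."""
    # Match labels case-insensitively so 'Strongly Disagree' and
    # 'Strongly disagree' receive the same score.
    lookup = {label.lower(): value for label, value in scores.items()}
    values = [lookup[r.strip().lower()] for r in responses]
    return mean(values) if values else None
```

For example, `mean_score(["Strongly agree", "Somewhat disagree"])` averages +2 and -1.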

The average for a given question in the report could then be compared to the average for the same question at higher organizational levels. For example, if "NEURO LAB 1" is the "Department/Org Level 1", then the "Division/Org Level 2" corresponds to "NEUROSCIENCE - CA", and the "Strategic Unit/Org Level 3" corresponds to "ACADEMIC RESEARCH". For a given question, the report could include the average of responses for each of those three levels, as well as the Institute average.

Similarly, suppose we are generating a report for the "NEUROSCIENCE - CA" level specifically for respondents who provided a "gender identity" answer of "Female". In addition to computing the average of responses for all Female respondents in "NEUROSCIENCE - CA", we would also show the average for all Female respondents in "ACADEMIC RESEARCH", and all Female respondents Institute-wide.
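The comparison described above could be sketched roughly as follows. This assumes each respondent is a dict with one key per org level plus a precomputed numeric `score` for the question, and that the org levels are strictly hierarchical. The key names (`org_l1`, `org_l2`, `org_l3`, `score`) and the function name are hypothetical, not the real column names.

```python
from statistics import mean

# Hypothetical keys for the three org-level columns, most granular first.
LEVELS = ["org_l1", "org_l2", "org_l3"]


def comparison_means(rows, report_unit, filters=None):
    """Return [(unit, mean score)] from the report's own level up to the Institute."""
    filters = filters or {}
    # Locate the report unit to learn its level and its parent units.
    anchor, start = None, None
    for i, level in enumerate(LEVELS):
        anchor = next((r for r in rows if r[level] == report_unit), None)
        if anchor is not None:
            start = i
            break
    if anchor is None:
        raise ValueError(f"unknown organizational unit: {report_unit}")
    # Apply the same demographic filters at every level.
    matching = [r for r in rows if all(r.get(k) == v for k, v in filters.items())]
    results = []
    for level in LEVELS[start:]:
        scores = [r["score"] for r in matching if r[level] == anchor[level]]
        results.append((anchor[level], mean(scores) if scores else None))
    scores = [r["score"] for r in matching]
    results.append(("Institute", mean(scores) if scores else None))
    return results
```

Calling `comparison_means(rows, "NEUROSCIENCE - CA", {"gender": "Female"})` would then return the Female averages for "NEUROSCIENCE - CA", "ACADEMIC RESEARCH", and the Institute, in that order.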

rjawesome commented 1 year ago

Scores have been added (but not comparisons yet). Example:

The scales are 2, 1, 0, -1, -2 or 3, 2, 1, -1, -2, -3, depending on whether the sequence contains an odd or even number of entries. (The example for the picture would be: Very likely = 3, Likely = 2, Somewhat likely = 1, Somewhat unlikely = -1, Unlikely = -2, Very unlikely = -3.)
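One way to generate these scales for any number of options is sketched below. The function name is hypothetical and this is not necessarily how the script computes them.

```python
def symmetric_scale(n_options):
    """Scores for n ordered options, most favorable first."""
    half = n_options // 2
    positive = list(range(half, 0, -1))           # e.g. [3, 2, 1]
    negative = [-v for v in reversed(positive)]   # e.g. [-1, -2, -3]
    middle = [0] if n_options % 2 else []         # neutral midpoint only for odd counts
    return positive + middle + negative


# Pair the scores with labels that are already ordered most favorable first.
labels = ["Very likely", "Likely", "Somewhat likely",
          "Somewhat unlikely", "Unlikely", "Very unlikely"]
likelihood_scores = dict(zip(labels, symmetric_scale(len(labels))))
```

Skipping zero for an even number of options keeps the scale symmetric when there is no neutral answer.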

rjawesome commented 1 year ago

Some score comparisons have been implemented, using the department and gender/ethnicity of the report. Example (NEURO LAB 2, Female report):

I'll probably move some of the info off of the title because it is getting a bit messy.

For this:

In addition to computing the average of responses for all Female respondents in "NEUROSCIENCE - CA", we would also show the average for all Female respondents in "ACADEMIC RESEARCH"

I was confused about how you would tell that neuroscience is directly a part of academic research. I thought the higher organizational levels were not perfectly hierarchical?

andrewsu commented 1 year ago

I was confused about how you would tell that neuroscience is directly a part of academic research. I thought the higher organizational levels were not perfectly hierarchical?

I think the org levels will be perfectly hierarchical. If you see a case where that's not true, that's likely an error in how I created the sample data. Please let me know...

So with that context, I think the comparisons for the NEURO LAB 2, Female report should be: NEUROSCIENCE - CA, Female; ACADEMIC RESEARCH, Female; and Institute, Female. (I think I've been using "Institute" where you've been using "All"; same meaning, I think...)

I'll probably move some of the info off of the title because it is getting a bit messy

Makes sense to me. I think below the legend could be a reasonable place for this info...

rjawesome commented 1 year ago

Reports added for "parent" organizational levels. I also moved the scores to the side next to the title (see the picture, same Neuro Lab 2/Female)

andrewsu commented 1 year ago

So I think this is going to be too many comparisons to be easily interpretable. Please modify so you only print the scores for the other organizational levels with the same demographic filters. So, for the NEURO LAB 2, Female report, the only comparison numbers listed should be for

NEUROSCIENCE - CA, Female
ACADEMIC RESEARCH, Female
Institute, Female

Similarly, in the NEURO LAB 2, Male report, the only comparison numbers listed should be for

NEUROSCIENCE - CA, Male
ACADEMIC RESEARCH, Male
Institute, Male

And for the NEURO LAB 2 (all responses) report, the only comparison numbers listed should be for

NEUROSCIENCE - CA (all responses)
ACADEMIC RESEARCH (all responses)
Institute (all responses)

Let me know if that doesn't make sense...

rjawesome commented 1 year ago

Should be done. Example:

andrewsu commented 1 year ago

Two notes/requests around organizational levels please...

NEURO LAB 2+Female.pdf doesn't report stats for NEUROSCIENCE - CA+Female.pdf, but it should because NEUROSCIENCE - CA is the L2 organizational level for NEURO LAB 2. Similar for NEURO LAB 2+Male.pdf...
Change ordering of comparison groups in the PDF. For example, for PLANNING, DESIGN & CONST.pdf, report the scores from most granular to most general:
- L1: Report score
- L2: Facilities
- L3: Administration
- L4: Institute

In addition, I'm mixing in a few other change requests (that don't directly relate to organizational levels, but just to keep these all in one issue):

race/ethnicity reports should be tested at all levels (L1 - Institute). I committed a new sample data file data/sample_survey_data_20230904.xlsx that should trigger race/ethnicity reports for SECURITY & CAMPUS SVCS (L1) and MOLECULAR MEDICINE - CA (L2).
"Moderate" should be more favorable than "somewhat". So, "Very prepared" > "Moderately prepared" > "Somewhat prepared" > "Slightly prepared"
Treat "Does not apply to me" as missing data, not the most unfavorable option
Q63 has a report score range -3.0 to 3.0, but I think there are only five options there. Is that a bug or am I missing something?
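A minimal sketch of the "missing data" request above: drop "Does not apply to me" before averaging instead of scoring it. The score values and the function name are assumptions for illustration; the label order follows the "Very" > "Moderately" > "Somewhat" > "Slightly" ranking requested above.

```python
from statistics import mean

# Labels treated as missing data rather than as an answer on the scale.
MISSING_LABELS = {"Does not apply to me"}

# Hypothetical scores for a four-option scale, most favorable first.
PREPARED_SCORES = {
    "Very prepared": 2,
    "Moderately prepared": 1,
    "Somewhat prepared": -1,
    "Slightly prepared": -2,
}


def mean_excluding_missing(responses, scores, missing=MISSING_LABELS):
    """Average the scores, ignoring responses that count as missing."""
    values = [scores[r] for r in responses if r not in missing]
    return mean(values) if values else None
```

With this approach a respondent who answers "Does not apply to me" changes neither the numerator nor the denominator of the average.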

rjawesome commented 1 year ago

NEURO LAB 2+Female.pdf doesn't report stats for NEUROSCIENCE - CA+Female.pdf, but it should because NEUROSCIENCE - CA is the L2 organizational level for NEURO LAB 2. Similar for NEURO LAB 2+Male.pdf...

No reports are generated by gender for NEUROSCIENCE - CA. Hence it is not present in the NEURO LAB 2 report.

Q63 has a report score range -3.0 to 3.0, but I think there are only five options there. Is that a bug or am I missing something?

Not seeing this when running it on my computer, so I'm not sure where you are seeing this.

race/ethnicity reports should be tested at all levels (L1 - Institute). I committed a new sample data file data/sample_survey_data_20230904.xlsx that should trigger race/ethnicity reports for SECURITY & CAMPUS SVCS (L1) and MOLECULAR MEDICINE - CA (L2).

It seems like only one person selected "Other" in each of these groups. Since 1 < 5, no reports are generated by race/ethnicity.
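The size check could look roughly like the sketch below. The threshold of 5 is inferred from the "1 < 5" comparison above, and both the inclusive cutoff and the function name are assumptions.

```python
from collections import Counter

MIN_GROUP_SIZE = 5  # assumed from the "1 < 5" comparison above


def reportable_groups(values, min_size=MIN_GROUP_SIZE):
    """Return the demographic values with enough respondents to get a report."""
    counts = Counter(values)
    return sorted(group for group, n in counts.items() if n >= min_size)
```

A single "Other" respondent would be filtered out here, so no race/ethnicity report would be produced for that value.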

Change ordering of comparison groups in the PDF.

Fixed.

"Moderate" should be more favorable than "somewhat". So, "Very prepared" > "Moderately prepared" > "Somewhat prepared" > "Slightly prepared"

Fixed

Treat "Does not apply to me" as missing data, not the most unfavorable option

Fixed

andrewsu commented 1 year ago

NEURO LAB 2+Female.pdf doesn't report stats for NEUROSCIENCE - CA+Female.pdf, but it should because NEUROSCIENCE - CA is the L2 organizational level for NEURO LAB 2. Similar for NEURO LAB 2+Male.pdf...

No reports are generated by gender for NEUROSCIENCE - CA. Hence it is not present in the NEURO LAB 2 report.

Great point. I agree, this is the desired behavior.

Q63 has a report score range -3.0 to 3.0, but I think there are only five options there. Is that a bug or am I missing something?

Not seeing this when running it on my computer, so I'm not sure where you are seeing this.

Hmm, I'm no longer seeing this behavior, so let's consider it a temporary issue with my setup.

race/ethnicity reports should be tested at all levels (L1 - Institute). I committed a new sample data file data/sample_survey_data_20230904.xlsx that should trigger race/ethnicity reports for SECURITY & CAMPUS SVCS (L1) and MOLECULAR MEDICINE - CA (L2).

It seems like only one person selected "Other" in each of these groups. Since 1 < 5, no reports are generated by race/ethnicity.

Thanks for catching this oversight. I slightly modified the sample data (now saved as data/sample_survey_data_20230904b.xlsx) and confirmed that race/ethnicity reports are now generated for SECURITY & CAMPUS SVCS (L1) and MOLECULAR MEDICINE - CA (L2).

Change ordering of comparison groups in the PDF.

Fixed.

"Moderate" should be more favorable than "somewhat". So, "Very prepared" > "Moderately prepared" > "Somewhat prepared" > "Slightly prepared"

Fixed

Treat "Does not apply to me" as missing data, not the most unfavorable option

Fixed

Perfect! I think this is a great version 1.0 (of the org level stats raised in this issue, and in the analysis script overall)!

andrewsu / mentorship-survey-analysis

display mean values for cohort and for higher organizational levels #2