UI-Research / mobility-from-poverty

https://ui-research.github.io/mobility-from-poverty/

20. Update ratio of pay on the average job to the cost of living #218

Closed awunderground closed 7 months ago

awunderground commented 1 year ago

I don't love wage_ratio_quality

cdsolari commented 9 months ago

We are working to determine a good list of categories for industries. These are in review by Greg. I have two ideas. If you have suggestions, also open to those!

kmartinchek commented 8 months ago

I created a new .do file that loads the 2022 QCEW data, merges it with the existing 2022 MIT Living Wage data, and generates updated county-level metrics; it is on branch 218. I used 2022 QCEW because the annualized figures used in the calculations are not yet available for 2023 (only Q1 and Q2). To my knowledge, this indicator is not used in the city-level metrics, so I did not generate those. I changed the variable name in the 2022 update (as noted above) but not in the 2018 and 2021 updates, and I did not see files for 2014 in the current directory. Should the rename be applied to those years as well? I don't yet have direction on the "industry" task listed above, so I will wait to proceed until I do. It appears Claudia is working on it!
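
For readers following along, here is a minimal sketch of the merge-and-ratio step described above. It is written in Python for illustration; the actual work lives in the .do file on branch 218. The dictionaries, the function name `ratio_living_wage`, and all values are hypothetical, and the sketch assumes both inputs have already been put on the same annual basis.

```python
# Illustrative values only; these are not real QCEW or MIT Living Wage figures.
average_annual_pay = {"01001": 48000.0, "01003": 45500.0, "01005": 39000.0}
annual_living_wage = {"01001": 40000.0, "01003": 45500.0}


def ratio_living_wage(pay_by_county, living_wage_by_county):
    """Return county FIPS -> pay / living wage, or None when either side is missing."""
    ratios = {}
    for fips, pay in pay_by_county.items():
        living_wage = living_wage_by_county.get(fips)
        if pay is None or not living_wage:
            # Unmatched county or zero living wage: leave the metric missing.
            ratios[fips] = None
        else:
            ratios[fips] = round(pay / living_wage, 3)
    return ratios


ratios = ratio_living_wage(average_annual_pay, annual_living_wage)
```

Counties without a living-wage match are kept with a missing value rather than dropped, so a downstream quality flag can still be attached to them.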

cdsolari commented 8 months ago

We wanted to add a subgroup for industry. I found the Standard Industrial Classification (SIC), which offers divisions (10 categories): https://www.osha.gov/data/sic-manual. After a meeting, we thought we could reduce this to 8 categories:

  1. Division A: Agriculture, Forestry, and Fishing & Division B: Mining
  2. Division C: Construction
  3. Division D: Manufacturing
  4. Division E: Transportation, Communications, Electric, Gas, and Sanitary Services
  5. Division F: Wholesale Trade & Division G: Retail Trade
  6. Division H: Finance, Insurance, and Real Estate
  7. Division I: Services
  8. Division J: Public Administration

But I also realize these don't map one-to-one to the codes you have. I created a Box folder, Metrics_2024_round/Living_Wage_Jobs_Industry, with some spreadsheets I put together; I think we can get those codes to align. We're also open to your ideas.

In the meantime, I am checking with our web development team to see whether they have any concerns about displaying so many categories. I'd prefer to reduce these in meaningful ways while keeping enough distinction to be useful.

kmartinchek commented 8 months ago

Claudia,

I used the 2022_NAICS_Structure_Summary_Table to recode county-level NAICS codes into divisions and noticed some things I wanted to confirm with you before moving forward. First, there are no observations in the 2022 county-level data for Divisions F, G, H, I, and J, which doesn't seem right. On checking, several industries that appear in the QCEW data are not listed in the Excel spreadsheet and so are unassigned to divisions: @.***

I think we should recode these unlisted industries. I am also looking further into why there aren't any county-level data in QCEW for the last 5 divisions, but I wanted to share this update.

cdsolari commented 8 months ago

Thanks for this information! We are reconsidering the categories for this round of calculations because the Dashboard limits the number of subgroup categories it can accommodate. I have an inquiry with leadership on this and will circle back as soon as I have a response. Regardless, one of the categories would be public administration jobs (Division J), and the other jobs would fall into other categories, so we'll still need to figure out what's going on with all the missingness. Thanks for flagging that. I have files in our Box folder that try to map divisions to the QCEW categories. I wonder if those are still missing in the non-NAICS industry codes, and if the items nested under industry codes 101 ("goods-producing"), 102 ("service-providing"), and 1028 ("Public administration") also suffer from the missingness. These are related to the first 15 industry codes in this list: https://www.bls.gov/cew/classifications/industry/industry-titles.htm. I agree that we'll have to figure out what to do with the unclassified category. Perhaps we don't highlight that category and just note that the "All" calculation includes those job cases.

kmartinchek commented 8 months ago

That makes sense—since we are waiting to hear back from leadership on the composition of categories (which will also inform how to attribute those with non-NAICS codes), I’ll pause on the subgroup calculations here until we have a bit more detail. 😊

cdsolari commented 8 months ago

Ok! It sounds like the Dashboard team has found a way to accommodate our use of 8 categories, assuming we can figure out the missing data problem. I think the goal is to resolve the missing data problem, and try to get our living wage for those 8 industries for the years that we can. The priority is getting it for the most recent year. If we can't resolve the missing data problem, we will not show any subgroups for this metric. Thanks! Let me know if you have any questions!

kmartinchek commented 8 months ago

Claudia,

Apologies for the delay! I did some digging: the QCEW uses super-sector NAICS codes, not the NAICS codes themselves. This is how they map onto one another: @.***

So the original groupings of divisions may not be ideal here. As you can see, the QCEW codes (see super-sector above) can't always fit into the divisions because they sometimes over-aggregate. I propose grouping the goods-producing codes together and keeping the other super-sector codes as-is.

Hopefully this update helps!

kmartinchek commented 8 months ago

[Screenshot 2024-02-15 152914]

cdsolari commented 8 months ago

Ok, what do you think of:

  1. All industries
  2. Goods-producing
  3. Public administration
  4. Trade/transport/utilities
  5. Information services
  6. Financial/professional/business services
  7. Education/health services
  8. Leisure/hospitality/other services

I vote that we exclude the unclassified category because it will be included in the all-industries group and it isn't clear what those jobs are or what to make of them.

I also wonder if the label for #6 is too long. Maybe we shorten that to "Professional services" and say in the notes that it includes Financial & business?
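
One way to picture the grouping proposed in the list above, as a rough Python sketch rather than the project's Stata code. The four-digit keys are the BLS QCEW supersector codes as I understand them from the industry-titles list linked earlier in this thread (worth verifying against that list). The employment-weighted average, the function name `subgroup_average_pay`, and the example rows are assumptions for illustration only.

```python
from collections import defaultdict

# Assumed QCEW supersector code -> proposed subgroup label.
# Code 1029 (unclassified) is deliberately left out of the subgroups.
SUBGROUP_BY_SUPERSECTOR = {
    "1011": "Goods-producing",  # natural resources and mining
    "1012": "Goods-producing",  # construction
    "1013": "Goods-producing",  # manufacturing
    "1021": "Trade/transport/utilities",
    "1022": "Information services",
    "1023": "Financial/professional/business services",  # financial activities
    "1024": "Financial/professional/business services",  # professional and business
    "1025": "Education/health services",
    "1026": "Leisure/hospitality/other services",  # leisure and hospitality
    "1027": "Leisure/hospitality/other services",  # other services
    "1028": "Public administration",
}


def subgroup_average_pay(rows):
    """rows: iterable of (supersector_code, employment, average_pay) for one county."""
    pay_total = defaultdict(float)
    employment_total = defaultdict(float)
    for code, employment, average_pay in rows:
        subgroup = SUBGROUP_BY_SUPERSECTOR.get(code)
        if subgroup is None:
            continue  # unclassified or unknown code: counted only in "All industries"
        pay_total[subgroup] += employment * average_pay
        employment_total[subgroup] += employment
    return {
        group: pay_total[group] / employment_total[group]
        for group in pay_total
        if employment_total[group] > 0
    }


# Hypothetical county rows: two goods-producing supersectors plus unclassified.
example_rows = [("1011", 100, 50000.0), ("1012", 300, 60000.0), ("1029", 10, 30000.0)]
subgroup_pay = subgroup_average_pay(example_rows)
```

Weighting by employment keeps a small supersector from pulling the combined subgroup average as much as a large one; a simple unweighted mean would be a different (and probably less defensible) choice.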

cdsolari commented 8 months ago

For the quality flag variable name, I saw the comment that the suggested name was not liked. I suggest "ratio_living_wage_quality". Let's go with that, because the web developers need to have that variable name finalized by Feb 20.

cdsolari commented 8 months ago

Our original goal was to calculate the new year of data, fill in any gap years since 2014, and generate subgroups for all those years. But our priority is to get information on the most recent year. For this new subgroup, if it is easy to loop over past years with the same subgroup code, that would be great. I know that we already had some missing categories, and I'm not sure whether we'll face similar inconsistencies in prior years. I'd rather have an accurate current year and no others than produce subgroups now for all past years and then find odd trends caused by back-end industry coding shifts.

kmartinchek commented 8 months ago

I am very close to finalizing this metric; there is just one snag to resolve. The wage data from QCEW are based on 2020 census geographies. This is an issue for one Alaska census area and for Connecticut, which moved to planning regions in 2022. Based on the county_populations crosswalk in the folder, it appears we are using 2022 geographies in this update, and the QCEW data aren't available for these new geographies. I can (1) use the old geographies, (2) use the new geographies and categorize CT counties as missing, or (3) do something else if there is a better idea. All other calculations are complete; I just need to determine how to treat CT's new geographies given the data constraints.
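
To make option (2) concrete, here is a small Python sketch of aligning results to a target geography list and leaving unmatched places missing. The FIPS codes and values are illustrative assumptions (the 091xx codes stand in for Connecticut planning regions and 09001 for an old county), and `align_to_target_geographies` is a hypothetical helper, not something in the repository.

```python
def align_to_target_geographies(target_fips, ratio_by_fips):
    """Keep every target geography; use None where the source has no value."""
    aligned = {fips: ratio_by_fips.get(fips) for fips in target_fips}
    # Source geographies that no longer exist in the target list (for review).
    unmatched_source = sorted(set(ratio_by_fips) - set(target_fips))
    return aligned, unmatched_source


# Hypothetical inputs: target list in 2022 geographies, ratios keyed by 2020 geographies.
target_2022 = ["09110", "09120", "10001"]
ratios_2020 = {"09001": 1.10, "10001": 1.05}

aligned, unmatched = align_to_target_geographies(target_2022, ratios_2020)
```

Returning the unmatched source codes alongside the aligned results makes it easy to confirm that only the expected Connecticut and Alaska geographies fell out of the join.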

cdsolari commented 8 months ago

Note, that to make life easier for the developer, we should rename ratio_living_wage_quality to ratio_average_to_living_wage_quality. This makes for a very long variable name, however, it is otherwise the only metric where we do not use the same text as the overall metric name for its corresponding quality. As you finalize this code, please make that switch. Thank you!!!!

kmartinchek commented 8 months ago

Claudia,

Is there a way to shorten this by 4 characters? The max is 32 characters in Stata and this is 36 characters.

cdsolari commented 8 months ago

Ok, here's the new solution: please change the overall metric variable name to "ratio_living_wage", and the quality variable will be "ratio_living_wage_quality". This keeps us within the character limit and sticks to the format the other metrics use. Thank you!
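
As a quick check of the character counts discussed in this thread, a few lines of Python (the helper name `fits_stata_limit` is made up for this example) confirm which of the candidate names fit within Stata's 32-character variable-name limit:

```python
STATA_MAX_VARNAME_LENGTH = 32  # limit cited earlier in this thread


def fits_stata_limit(name):
    """True when the variable name is short enough for Stata."""
    return len(name) <= STATA_MAX_VARNAME_LENGTH


candidate_names = [
    "ratio_average_to_living_wage_quality",
    "ratio_living_wage",
    "ratio_living_wage_quality",
]
name_lengths = {name: len(name) for name in candidate_names}
too_long = [name for name in candidate_names if not fits_stata_limit(name)]
```

Only the first name exceeds the limit, by the 4 characters noted above.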

awunderground commented 8 months ago

PR is #287

awunderground commented 7 months ago

Closed with #287