alan-turing-institute / uatk-spc

Synthetic Population Catalyst
https://alan-turing-institute.github.io/uatk-spc/
MIT License

Labor Market use case #36

Closed mfbenitezp closed 1 year ago

mfbenitezp commented 2 years ago

Following conversations with @oguerrer and Kathryn Fair from Shocks and Resilience, we will explore the potential use of SPC output for studying the labour market in the UK.

mfbenitezp commented 2 years ago

After a more detailed conversation with Kat, here are the fields that SPC would need to support.

HSalat commented 1 year ago
  • [ ] Activation rates: these tell us the probability that an employed (resp. unemployed) person is currently actively seeking a new job at any given timestep. Not in SPC but could re-use the national values we calculated.

We should already have employed/unemployed status, but we need to check and will add it if not. We do not know whether people are currently seeking a job. We will check whether this can be inferred from the data sources currently used, but it is unlikely.

  • [ ] Income bounds: values for the minimum and maximum possible annual income. Should be possible to pull this from SPC

The minimum salary is the legal minimum; the maximum is artificially capped (the highest decile was not precise in the source data). We don't have a systematic method to assign the highest salaries, so this is left to the user if needed.

  • [ ] Age distribution: contains observations of individual’s age (yearly granularity) in the population. Should be possible to pull this from SPC

Yes.

  • [ ] Consumption preference distribution: contains observations of consumption preference value (calculated as 1 – fraction of time spent working) in the population. Should be possible to pull fraction of time spent working from SPC to calculate this

The question is not clear to us.

  • [ ] Income distribution: tells us the mean and standard deviation for annual incomes associated with job positions with a given combination of industry, occupation, and geographical region. SPC income data appears to be split by occupation/region – is there a way to account for industry as well?

Industry is taken into account through SOC (4-digit), but not SIC. This is built into the source data.

  • [ ] Position distribution: contains observations of individual jobs within the UK economy, where each job is identified by its industry, occupation, and geographical region. SPC generates a population of jobs by occupation and industry for each region, so this should be possible to generate.

There is a reference file containing all positions at LSOA level in England (UK?). The assignment process is not extremely precise, because that is not the purpose of the tool, but it can easily be replaced by any other assignment method if required.

  • [ ] Labour flow networks: 3 matrices containing values describing the relative frequencies of individuals switching to a job in a different geographical region, different industry, or different occupation. Within these matrices the rows are where the individual is switching from and the columns are where the individual is switching to, such that entry (i,j) tells us about switches from the ith region (or industry, or occupation) to the jth region (or industry, or occupation). Values are generated by dividing the number of switches from i to j by the total number of switches across all cells. This would need to be introduced to SPC (with some downscaling applied to match spatial granularity).

We don't have that, keen on introducing it if relevant source data can be found.
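
For reference, the normalisation described in the item above could be sketched as follows. This is only an illustration, not SPC code: the function name `flow_matrix`, the region labels, and the list of observed switches are all made up, and real source data would still be needed.

```python
from collections import Counter


def flow_matrix(switches, categories):
    """Build a relative-frequency matrix from observed job switches.

    `switches` is a list of (origin, destination) pairs and `categories`
    fixes the row/column order. Entry [i][j] is the number of switches
    from categories[i] to categories[j] divided by the total number of
    switches across all cells.
    """
    index = {cat: k for k, cat in enumerate(categories)}
    counts = Counter(switches)
    total = sum(counts.values())
    n = len(categories)
    matrix = [[0.0] * n for _ in range(n)]
    if total == 0:
        return matrix  # no observed switches: leave every cell at zero
    for (origin, dest), count in counts.items():
        matrix[index[origin]][index[dest]] = count / total
    return matrix


# Hypothetical region-to-region switches (labels are placeholders).
regions = ["A", "B", "C"]
switches = [("A", "B"), ("A", "B"), ("B", "A"), ("A", "C")]
region_flows = flow_matrix(switches, regions)
```

The same function would apply unchanged to the industry and occupation matrices, since only the category labels differ.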

  • [ ] Age-specific survival probabilities: probabilities of surviving to the next age (yearly granularity). Not in SPC but could reuse the national values.

Not in SPC, but computed for another project, can be added if necessary.

  • [ ] Input-output table: indicates interdependence between different industries within the UK, used to determine how similar industries are. Not in SPC, could re-use the national values (or possibly explore whether these data are available at a finer spatial granularity)

No plan to introduce this ourselves.

  • [ ] Skills data: used to determine how similar occupations are (based on their skills requirements), obtained from O*NET. Not in SPC, can re-use our current skills data (possible re-calc depending on SOC granularity).

Idem.

  • [ ] Distances between all region pairs: used to determine similarity between geographical regions – currently calculated as great circle distance between largest population centre within each region. Not in SPC, but presumably very easy to calculate? Also open to tweaks to the methodology here.

SPC has full population distributions at MSOA level (OA in the next update), so it's a good source for this type of work, for example if you want to weight by specific socio-economic characteristics.
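
As a rough sketch of the great-circle calculation mentioned in the last item, assuming each region is represented by a single (latitude, longitude) point: the haversine formula below uses a mean Earth radius of 6371 km. The function names are hypothetical and the two coordinates are approximate and illustrative only; this is not part of SPC.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # min() guards against rounding pushing sqrt(a) marginally above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def pairwise_distances(centres):
    """Distance for every unordered pair of regions.

    `centres` maps a region name to the (lat, lon) of its representative
    point, e.g. its largest population centre.
    """
    return {
        (a, b): great_circle_km(*centres[a], *centres[b])
        for a, b in combinations(sorted(centres), 2)
    }


# Approximate, illustrative coordinates only.
centres = {"London": (51.5074, -0.1278), "Manchester": (53.4808, -2.2426)}
distances = pairwise_distances(centres)
```

Replacing the representative point with a population-weighted centroid would only change how `centres` is built, not the distance calculation.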

mfbenitezp commented 1 year ago

@oguerrer we are working on a new and improved version of the SPC model and outputs. As part of that, we revisited this issue, which lists the variables you and Kat requested. @HSalat has provided a response to each requested variable above. Would you mind taking a look at the comments and questions? Some of the attributes are already included in the current outputs.

mfbenitezp commented 1 year ago

In conversation with @oguerrer and Kat Fair, we decided to put this issue on hold until next year, so there is no need to keep it open.