Open FanWangEcon opened 1 year ago
@szkaifeng
Generate several figures and maps from the 2020 demographic data:
Compare the 2020 and 2010 demographic data, matching counties where possible.
Published book [中国2000年人口普查分县资料] (2000 Population Census of China, county-level data): https://bbs.pinggu.org/thread-433655-1-1.html; https://data.casearth.cn/sdo/detail/5c19a5670600cf2a3c557af9; https://github.com/leiii/census/tree/main/data/census (2010, 2020)
We have identified 2020 and 2010 census data with county-specific breakdowns of gender $\times$ age shares.
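As a rough illustration of what "gender $\times$ age shares" means at the county level, here is a minimal sketch. The county codes, age bins, and counts are made-up placeholders, not values from the census files, and the dictionary layout is an assumption about how the counts might be held in memory.

```python
from collections import defaultdict

# Hypothetical counts keyed by (county_code, gender, age_group) -> persons.
counts = {
    ("110101", "male", "0-14"): 200,
    ("110101", "male", "15-64"): 500,
    ("110101", "female", "0-14"): 100,
    ("110101", "female", "15-64"): 200,
    ("110102", "male", "0-14"): 30,
    ("110102", "female", "0-14"): 70,
}

def gender_age_shares(counts):
    """Return each (county, gender, age) cell as a share of its county total."""
    totals = defaultdict(int)
    for (county, _gender, _age), n in counts.items():
        totals[county] += n
    # Skip counties with zero total population to avoid dividing by zero.
    return {
        key: n / totals[key[0]]
        for key, n in counts.items()
        if totals[key[0]] > 0
    }

shares = gender_age_shares(counts)
```

Within each county the shares sum to one, so they describe the joint distribution of gender and age for that county.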
@marcomlaghi
@marcomlaghi and @szkaifeng
The shapefile corresponding to each census is different. We do not need harmonized shapefiles, but we do need shapefiles that show county boundaries in each census year.
The information we need from these shapefiles is essentially which $0.25 \text{km} \times 0.25 \text{km}$ square (or other smallest unit of the geographic climate data) corresponds to which county in which year, so that we can link climate data with population data.
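The cell-to-county link can be sketched as a point-in-polygon test on each grid cell's center. The snippet below is a pure-Python illustration using ray casting on toy square "counties"; the polygon coordinates, cell ids, and function names are hypothetical. With real shapefiles, a spatial-join library would likely be more practical, and counties made of several polygons would need extra handling.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices (not repeated at the end)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate where this edge crosses the horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def assign_cells(cell_centers, county_polygons):
    """Map each cell id to the first county containing its center, else None."""
    result = {}
    for cell_id, (x, y) in cell_centers.items():
        result[cell_id] = None
        for code, polygon in county_polygons.items():
            if point_in_polygon(x, y, polygon):
                result[cell_id] = code
                break
    return result

# Toy boundaries for one census year (hypothetical shapes and codes).
county_polygons = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
cell_centers = {"c1": (0.5, 0.5), "c2": (3.0, 1.0), "c3": (5.0, 5.0)}
cell_to_county = assign_cells(cell_centers, county_polygons)
```

Running `assign_cells` once per census year, each time with that year's boundaries, would give the year-specific mapping without requiring harmonized shapefiles.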
Note: [IPUMS IHGIS] does have 1982, 1990, and 2000 county-level shapefiles with population by age and sex.
A decent public harmonized shapefile for censuses 1 to 6: https://www.scidb.cn/en/detail?dataSetId=849628989872930816. A follow-up task could be manually adding the age group by gender counts to each county.
Haoran Wu, Liang Gao, Dongdong Song, et al. A dataset of district/county-level population distribution of China’s six national censuses[DS/OL]. V2. Science Data Bank, 2022[2023-06-22]. https://doi.org/10.11922/sciencedb.j00001.00273. DOI:10.11922/sciencedb.j00001.00273. https://cstr.cn/31253.11.sciencedb.j00001.00273. CSTR:31253.11.sciencedb.j00001.00273.
1990 county data from sedac https://sedac.ciesin.columbia.edu/data/set/cddc-china-population-census-and-agriculture
Here is a link to the label dictionary as well: https://citas.csde.washington.edu/data/chinaA/datasets.htm
2000s Census Data
County Shapefile from here: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/VKGEBX&version=1.0
Age-Gender Census counts from here: https://dataverse.harvard.edu/dataverse/chinacensus
The data come from China Data Online (CDO), but these versions are preferred because they can be downloaded province by province, whereas downloading directly from CDO appears to require going prefecture by prefecture.
Next step: After I check that IPUMS does not already provide this data, I will try to combine the datasets as follows...
Location id (county code); gender; age groups grouped; year by year age (~700,000 rows)
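One way to reach a layout like this is to reshape one-row-per-county tables into one row per county, gender, and age. The sketch below assumes, purely for illustration, that the source columns are named like `male_0` and `female_1`; the actual column names in the downloaded files may differ, and the values shown are placeholders.

```python
def wide_to_long(rows, id_col="county_code"):
    """Reshape one-row-per-county records into one row per county, gender, and age.

    Assumes (hypothetically) that every non-id column is named '<gender>_<age>'.
    """
    long_rows = []
    for row in rows:
        for col, value in row.items():
            if col == id_col:
                continue
            gender, age = col.rsplit("_", 1)
            long_rows.append({
                "county_code": row[id_col],
                "gender": gender,
                "age": int(age),
                "count": int(value),
            })
    return long_rows

# Placeholder input, as it might look after reading a CSV (all values are strings).
wide = [
    {"county_code": "110101", "male_0": "12", "male_1": "15",
     "female_0": "11", "female_1": "14"},
    {"county_code": "110102", "male_0": "7", "male_1": "9",
     "female_0": "8", "female_1": "6"},
]
long_rows = wide_to_long(wide)
```

The long table has one row per county, gender, and single year of age, so its length is the product of those three dimensions.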
Documented 2000 Census data as follows:
I used the metadata to create text files, then inserted commas to turn them into CSV files listing each province's counties with county code, English name, and Chinese name. After cleaning, I merged these 31 new files with the 31 provinces' census data from the dataverse/China Data Online 2000 data, using the counties' English names as the key. This let me check that names were not repeated or misattributed; the data also often appends an extra name to common county names to distinguish them. Finally, I compiled the 31 merged files into one.
Files saved on Dropbox: ./marco_laghi/CensusUpload2000
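The name-based merge described above can be sketched as follows for a single province. This is not the script that was actually used; the column names (`name_en`, `county_code`, `male_total`, and so on) and the sample rows are assumptions for illustration. The point is the two checks: no repeated names in the code list, and a record of census rows that fail to match.

```python
import csv
import io
from collections import Counter

def read_csv(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def merge_on_name(code_rows, census_rows, key="name_en"):
    """Attach county codes to census rows by English name.

    Raises if the code list repeats a name; returns merged rows plus
    the names of census rows that found no match.
    """
    name_counts = Counter(r[key] for r in code_rows)
    dupes = [name for name, n in name_counts.items() if n > 1]
    if dupes:
        raise ValueError(f"duplicate names in code list: {dupes}")
    by_name = {r[key]: r for r in code_rows}
    merged, unmatched = [], []
    for row in census_rows:
        match = by_name.get(row[key])
        if match is None:
            unmatched.append(row[key])
        else:
            merged.append({**match, **row})
    return merged, unmatched

# Placeholder inputs (hypothetical columns and counts).
codes_csv = """county_code,name_en,name_zh
110101,Dongcheng,东城区
110102,Xicheng,西城区
"""
census_csv = """name_en,male_total,female_total
Dongcheng,100,110
Xicheng,120,125
Unknown County,1,1
"""
merged, unmatched = merge_on_name(read_csv(codes_csv), read_csv(census_csv))
```

Repeating this per province and concatenating the `merged` lists would give the single compiled file, while any non-empty `unmatched` list flags names that need manual cleaning.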
@szkaifeng
Obtain county-level population data for China showing the joint distribution of gender and age at the county level, in multiple years if possible.
Note:
Potential data sources: