JuliaHealth / IPUMS.jl

A convenience package to work with IPUMS data
http://juliahealth.org/IPUMS.jl/
MIT License

[FEATURE] Create a blog post to explain the structure of an IPUMS extract #26

Open 00krishna opened 4 months ago

00krishna commented 4 months ago

Issue Description

Write a blog post that explains the structure of basic IPUMS data extracts. IPUMS extracts come in different forms. The most basic form consists of a DDI (.xml) file and a compressed DAT data file. The DDI file contains metadata about the variables in the extract, such as variable names, data types, and value ranges. The DAT file contains the data itself in fixed-width format, stored only as numbers, never as text.
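
To make the DDI/DAT relationship concrete, here is a minimal Python sketch of how column positions from the metadata can be used to slice a fixed-width record. The XML fragment is a hypothetical, heavily simplified stand-in for a real DDI codebook (which carries namespaces, labels, and much more), and the records and helper names (`read_layout`, `parse_record`) are made up for illustration. This is not the IPUMS.jl API.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for a DDI codebook: each variable
# declares the 1-based start and end column it occupies in a record.
DDI_XML = """
<codeBook>
  <dataDscr>
    <var name="YEAR"><location StartPos="1" EndPos="4"/></var>
    <var name="AGE"><location StartPos="5" EndPos="7"/></var>
    <var name="SEX"><location StartPos="8" EndPos="8"/></var>
  </dataDscr>
</codeBook>
"""

# Made-up fixed-width records: digits only, no delimiters, no header row.
DAT_LINES = ["20200341", "20200672"]


def read_layout(ddi_xml):
    """Return a list of (name, start, end) tuples taken from the metadata."""
    root = ET.fromstring(ddi_xml)
    layout = []
    for var in root.iter("var"):
        loc = var.find("location")
        layout.append(
            (var.get("name"), int(loc.get("StartPos")), int(loc.get("EndPos")))
        )
    return layout


def parse_record(line, layout):
    """Slice one fixed-width line into named integer fields."""
    # Positions are 1-based and inclusive, so shift the start by one.
    return {name: int(line[start - 1:end]) for name, start, end in layout}


layout = read_layout(DDI_XML)
records = [parse_record(line, layout) for line in DAT_LINES]
print(records)
```

The point of the sketch is that the DAT file cannot be interpreted on its own: the string `20200341` only becomes a year, an age, and a sex code once the column positions from the metadata are applied.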

The second type of extract is the NHGIS extract, which consists of a shapefile (.shp) holding both the GIS geometries of the selected units (city, state, county, etc.) and data variables, plus a CSV file holding only the variable values for each geographic unit.
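
As a rough illustration of the CSV half of such an extract, the Python sketch below reads a small table with one row per geographic unit and indexes it by an identifier column. It assumes the rows carry a shared key that could later be matched against the shapefile records; the column names (`GISJOIN`, `STATE`, `COUNTY`, `TOTAL_POP`) and all values are illustrative assumptions, not taken from a real extract.

```python
import csv
import io

# Made-up CSV content: one row per geographic unit, with an identifier
# column (assumed here to be called GISJOIN) and one data column.
CSV_TEXT = """GISJOIN,STATE,COUNTY,TOTAL_POP
G0100010,Alabama,Autauga County,1000
G0100030,Alabama,Baldwin County,2500
"""


def index_by_key(csv_text, key="GISJOIN"):
    """Read the table and return a dict mapping the key column to each row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key]: row for row in reader}


table = index_by_key(CSV_TEXT)

# Look up the values for one geographic unit by its identifier.
print(table["G0100030"]["COUNTY"], table["G0100030"]["TOTAL_POP"])
```

Note that `csv.DictReader` returns every field as a string, so numeric columns need an explicit conversion before analysis.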

The post should explain the format of the extracts and the information contained in each component. The intent is that in subsequent blog posts, the author can explain the code for extracting information from these files without having to explain the structure of the extract at the same time.

Difficulty: Beginner

Time: 6 - 8 hours

Requirements

Expected Outcomes

The anticipated outcome is a blog post, written in Markdown, that contains the elements listed above. The post is meant to be informational rather than technical, so it does not need to show much code. Using code and the IPUMS.jl package will be covered in a subsequent blog post.

Additional Notes

Additional information about the structure of IPUMS extracts is available on the IPUMS website. Some good sources of information include.

Other Resources

Julia Slack:

Julia Discourse - I would advise posting here if your issue is long or takes a lot of time to explain, since it could get lost in Julia Slack. Consider cross-posting your forum post to the Julia Slack in helpdesk and/or documentation.

TheCedarPrince commented 4 months ago

This looks like a good write-up @00krishna. I actually think this might be better to put into the documentation of the package rather than a separate blog post on the JuliaHealth website. What's your opinion, Krishna?

00krishna commented 3 months ago

@TheCedarPrince Sure, this can go directly into the package documentation as a tutorial. That sounds fine to me.