Dr. Wiggins's instructions from the syllabus (with key points in bold):
Your data cleaning and documentation draft will be part of your final project repository; this is
due by Week 10 to help provide early feedback. Take advantage of the opportunity by submitting
as complete a document as you can muster! If you are integrating multiple data sources, provide
(shorter) background details for each data set.
At a minimum, the documentation must include:
1-2 paragraph text description of the data source/s (how much, where from, what it contains, etc.) with a properly formatted citation for each data source.
Specifically identify any intellectual property constraints, or lack thereof (licensing).
1 paragraph description of the metadata: what information is available to help you interpret and understand the data?
Identify any issues you have encountered with the data: missing values, unstandardized content, entity matching, etc.
1 paragraph description of your rationale for the steps you're taking to remediate data. For example, if you need to fill in empty fields, specify what value you chose and why.
A script or step-by-step textual description (or a combination) that documents your data cleaning process with enough detail for replication.
This deliverable supports timely feedback for work-in-progress. Since most of you will use data
that is much "cleaner" than you would normally encounter in the wild, incomplete data processing
documents are acceptable only if you can clearly identify the barriers (or series of barriers) to
completion, which will help us help you troubleshoot. Any issues highlighted by instructor
feedback should be carefully attended to for your final project data processing documentation. This
document can take several forms (R script, Markdown, word processing), so choose the best one
for your project needs and submit the URL on Canvas.
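As a rough illustration of what a replicable cleaning script might look like, here is a minimal Python sketch. Everything in it is an assumption made for illustration and is not part of the syllabus: the sample data, the column names (site, species, count), the "NA" placeholder, and the clean_rows helper are all hypothetical. It fills empty fields with a stated value, standardizes text content, and logs every change so the rationale can be written up.

```python
import csv
import io

# Hypothetical sample data; column names and values are invented for illustration.
RAW_CSV = (
    "site,species,count\n"
    "A, Robin ,3\n"
    "B,robin,\n"
    "a,SPARROW,5\n"
)

# Assumed placeholder for empty counts. "NA" keeps gaps visible instead of
# letting them be read as zero; a real project should state its own choice.
MISSING_COUNT = "NA"


def clean_rows(raw_text):
    """Return (cleaned_rows, log); log records each change for replication."""
    cleaned, log = [], []
    reader = csv.DictReader(io.StringIO(raw_text))
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        site = row["site"].strip().upper()        # standardize site codes
        species = row["species"].strip().lower()  # standardize species names
        count = row["count"].strip()

        if site != row["site"]:
            log.append(f"line {line_no}: site {row['site']!r} -> {site!r}")
        if species != row["species"]:
            log.append(f"line {line_no}: species {row['species']!r} -> {species!r}")
        if count == "":
            count = MISSING_COUNT
            log.append(f"line {line_no}: empty count filled with {MISSING_COUNT}")

        cleaned.append({"site": site, "species": species, "count": count})
    return cleaned, log


if __name__ == "__main__":
    rows, changes = clean_rows(RAW_CSV)
    for change in changes:
        print(change)
```

The change log doubles as raw material for the issues and rationale paragraphs, since each entry names the line, the field, and the value chosen.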
Due week 10