Open ummel opened 2 years ago
@ummel: Almost done with this part except for handling the trip/vehicle data. Do you think we should merge the trip-level data with the person files or create a separate file for trips? In that case, we would need to update the create/compiledictionary functions to handle trip-level data.
The processed microdata need to be strictly household- and/or person-level observations. I would summarize the trip data at the household level (since this seems most likely to be used in practice) and merge it with the household variables, i.e. the data that will be saved as an H file.
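A minimal sketch of the household-level summary described above, using base R only. The data frames and column names (`hid`, `pid`, `miles`, `hhsize`) are toy stand-ins invented for illustration, not the actual NHTS variable names, and the choice of summaries (total miles, trip count) is an assumption about what might be useful.

```r
# Toy trip-level records: one row per trip (hypothetical column names)
trips <- data.frame(
  hid   = c(1, 1, 1, 2, 2),            # household identifier
  pid   = c(1, 1, 2, 1, 1),            # person identifier within household
  miles = c(2.5, 4.0, 10.0, 1.0, 3.0)  # trip distance
)

# Toy household-level records; household 3 reports no trips
households <- data.frame(hid = c(1, 2, 3), hhsize = c(2, 1, 4))

# Collapse trips to one row per household: total miles and number of trips
trip_miles <- aggregate(miles ~ hid, data = trips, FUN = sum)
names(trip_miles)[2] <- "trip_miles"
trip_count <- aggregate(miles ~ hid, data = trips, FUN = length)
names(trip_count)[2] <- "trip_count"
trip_hh <- merge(trip_miles, trip_count, by = "hid")

# Left join so that households without any trips are retained
hh <- merge(households, trip_hh, by = "hid", all.x = TRUE)

# Households with no trip records get zeros rather than NA
hh$trip_miles[is.na(hh$trip_miles)] <- 0
hh$trip_count[is.na(hh$trip_count)] <- 0

# The result must remain strictly one row per household
stopifnot(!anyDuplicated(hh$hid))
```

The left join (`all.x = TRUE`) matters here: a plain inner merge would silently drop households that made no trips, which would change the household sample.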
Have pushed the files "NHTS_2017_P_processed.R" and the corresponding dictionary to survey-processed/NHTS/2017/. Also merged the trip- and vehicle-level data into the person-level data, since I thought it would make analysis easier at a later stage. We could switch the merge to the household file if that makes more sense.
Create new .R script /survey-processed/NHTS/2017/NHTS_2017_P_processed.R. See the analogous ACS processing script for use as a template for code development. I think the only raw data feeding this script should come from the person-level NHTS data file. We want to keep the person-level data separate from the household data and retain maximum detail. The fusionData package is set up to smoothly handle both household- and person-level microdata from a given survey. In other words, there is no reason to prematurely or arbitrarily summarize person-level records at the household level.
Ensure that your new NHTS_2017_P_processed.R script contains the following code at the very end to commit the processed data dictionary and microdata to disk for year 2017.