Right now, the analytics pull in data from multiple CSVs. Need to make sure the RDD and the DataFrame do not include the header rows as raw data.
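A minimal PySpark sketch of one way to keep header rows out of both structures. It assumes every CSV starts with the same header line. The function names and the `sc` / `spark` / `path` parameters are hypothetical and not taken from the project code. PySpark is not imported here because the functions only use the context and session objects passed in.

```python
def is_header(line, header):
    """Return True when a raw text line is a repeat of the CSV header row."""
    return line.strip() == header.strip()


def load_rdd_without_headers(sc, path):
    """Read every CSV matching `path` into an RDD and drop each file's header line.

    `sc` is an existing SparkContext. Assumes all files share one header row.
    """
    lines = sc.textFile(path)
    header = lines.first()  # first line of the first file, taken as the header
    return lines.filter(lambda line: not is_header(line, header))


def load_dataframe_without_headers(spark, path):
    """Read every CSV matching `path` into a DataFrame.

    `spark` is an existing SparkSession. With header=True the CSV reader uses
    the first line for column names instead of returning it as a data row.
    """
    return spark.read.option("header", True).csv(path)
```

Filtering on the header text (rather than skipping only the first line of the RDD) matters with multiple CSVs, since each file contributes its own header line. A quick check after loading is to confirm that no value in a numeric column equals the column name.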
cleanup.sh unpacks the NOAA file and cleans it up. This job takes a while.
unpack_noaa was updated with instructions. Use gsod_csv.zip for the correct data.