nichollsbr / cmsc611-project


Make sure headers of csv files are not being included in rdd/dataframe #11

Closed nichollsbr closed 5 years ago

nichollsbr commented 5 years ago

Right now, the analytics pull in data from multiple CSVs. We need to make sure the RDD and the DataFrame do not include the header rows as raw data.
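A minimal sketch of one way to check this, assuming the analytics use PySpark (the issue does not say which Spark API the project uses). The header string, the function names, and the path argument are all hypothetical. Because every CSV carries its own header, dropping only the first line of the combined RDD would leave the other headers in, so the RDD path filters every line against the header text.

```python
# Hypothetical header row; replace with the real first line of the CSVs.
HEADER = "STATION,DATE,TEMP"


def is_header(line, header=HEADER):
    """Return True if a raw text line is a copy of the CSV header row."""
    return line.strip() == header


def drop_headers(lines, header=HEADER):
    """Pure-Python version of the filter, handy for checking the predicate."""
    return [line for line in lines if not is_header(line, header)]


def load_rdd(spark, path_glob, header=HEADER):
    """Read many CSVs as text and filter out every header line.

    `spark` is an existing SparkSession; `path_glob` is a hypothetical
    path or glob pattern matching the CSV files.
    """
    lines = spark.sparkContext.textFile(path_glob)
    return lines.filter(lambda line: not is_header(line, header))


def load_dataframe(spark, path_glob):
    """Let the CSV reader treat the first line of each file as a header."""
    return spark.read.csv(path_glob, header=True, inferSchema=True)
```

For the DataFrame, `header=True` tells the reader to use the first line of each file for column names instead of data. For the RDD, a quick sanity check would be to count lines matching `is_header` after loading and confirm the count is zero.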

nichollsbr commented 5 years ago

cleanup.sh unpacks the NOAA file and cleans it up. This job takes a while to run. unpack_noaa has been updated with instructions. Use gsod_csv.zip for the correct data.