cherrypi / Science-Fair_2019

Vernal Pond graphing and data, as well as data analysis.

Clean up your data - valid column names #4

Closed: VCF closed this issue 5 years ago

VCF commented 5 years ago

R data structures can have names. For data frames, matrices and 2D arrays, these are read with rownames() and colnames(). There are somewhat esoteric rules about what is allowed in these names; rather than diving into them, let's just look at where R is unhappy. Your current columns are named:

colnames(Pond_Data)
 [1] "Date"                "Rain.in."            "Depth.cm."          
 [4] "South"               "North"               "West"               
 [7] "East"                "Temperature..F..Max" "Temperature..F..Min"
[10] "Other.Observations" 

If we look at the "top" of your file (in a bash shell) we can see that R has been agitated over some of these:

head -n1 CSV_DataStorage.csv 
Date,Rain(in),Depth(cm),South,North,West,East,Temperature(*F) Max,Temperature(*F) Min,Other Observations

Compare the two listings to see which characters R has replaced with a . (a period is a legal character in R names). Then clean up the headers in the CSV file so that they match what colnames() reports after the data frame is loaded.
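The comparison can also be done inside R. By default, read.csv() passes the headers through make.names(), so a minimal sketch (the header strings are copied from the head -n1 output above; the variable names raw, fixed and changed are only for illustration) looks like this:

```r
# Headers exactly as they appear in the first line of the CSV file.
raw <- c("Date", "Rain(in)", "Depth(cm)", "South", "North", "West", "East",
         "Temperature(*F) Max", "Temperature(*F) Min", "Other Observations")

# make.names() applies the same repair that read.csv() uses by default
# (check.names = TRUE): every illegal character becomes a period.
fixed <- make.names(raw)

# Keep only the headers that R had to alter, side by side with the result.
changed <- raw != fixed
print(data.frame(csv = raw[changed], r = fixed[changed]))
```

Any header that shows up in this table contains at least one character R does not accept.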

Your use of units in the column headers is admirable (it adds descriptive information to the file) but is generally not done. Instead, the data are usually accompanied by a separate file, called a manifest, that describes each column. So take out the units, and we'll make a manifest in another issue.

VCF commented 5 years ago

You're close with b759e95 - but you're still using a character that's not considered "legal" in column headers. Again, use colnames() in R and head -n1 in bash to see what's different.

You may want to read about CamelCase - it's a handy way of representing multi-word variable names that's legal in most coding languages.
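One way to check a set of headers before committing is to ask R whether it would change any of them. The sketch below is only an illustration: the CamelCase names in candidate are hypothetical suggestions, not the names used in the repository, and is_valid_name is a helper made up for this example.

```r
# A name is acceptable if make.names() leaves it unchanged.
is_valid_name <- function(x) x == make.names(x)

# Hypothetical CamelCase headers with the units removed.
candidate <- c("Date", "Rain", "Depth", "South", "North", "West", "East",
               "TemperatureMax", "TemperatureMin", "OtherObservations")

# TRUE for every header that R will keep as written.
print(is_valid_name(candidate))

# Headers containing a space or a hyphen fail the check.
print(is_valid_name(c("Temperature Max", "Temperature-Max")))
```

If every value in the first result is TRUE, colnames() in R and head -n1 in bash should show the same names.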

cherrypi commented 5 years ago

How again do you change directories in R?

VCF commented 5 years ago
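
In base R the working directory is reported by getwd() and changed with setwd(). A minimal sketch, using tempdir() as a stand-in for the project folder (an assumption; substitute the real path to the repository):

```r
# Where R is currently looking for files.
start_dir <- getwd()

# Change directory; setwd() invisibly returns the previous directory.
old_dir <- setwd(tempdir())

# Confirm the move, then go back to where we started.
print(getwd())
setwd(old_dir)
```

Once the working directory is the folder that holds the CSV file, read.csv() can be given the bare file name.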

Good, it looks like 4dc6205 has addressed the issue.