DistanceDevelopment / spatial-workshops

Distance sampling workshop content
http://distancesampling.org/workshops/

Ensuring that the data get into R correctly #17

Closed dill closed 8 years ago

dill commented 8 years ago

How can folks ensure that they loaded the gdb file correctly and that the data in it are correct?

dill commented 8 years ago

In the exercises I've used `plot` and `head` to try to show folks that the data are loaded correctly. My theory is that if it's really messed up, it won't load. If it's only a bit messed up, they will be able to compare the plots in Arc and R and see if something weird is happening.
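A minimal sketch of that `plot` and `head` check, using a small made-up data frame in place of a layer read from the geodatabase. The object name `segs` and its columns are hypothetical, not the workshop data:

```r
# Hypothetical stand-in for a layer loaded from the gdb file;
# the column names and values are invented for illustration only.
segs <- data.frame(
  x      = c(-70.1, -70.0, -69.9, -69.8),
  y      = c(41.2, 41.3, 41.4, 41.5),
  Effort = c(10.2, 9.8, 10.1, 10.0)
)

# Look at the first few rows: are the expected columns there,
# and do the values look plausible?
head(segs)

# Plot the locations; the pattern should resemble what Arc shows.
# asp = 1 keeps the x and y scales equal so shapes are not distorted.
plot(segs$x, segs$y, asp = 1, xlab = "x", ylab = "y")
```

With real data the same two calls would be applied to whatever object the loading step returned.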

Any other tips? (Don't really want to delve into calculating summary statistics etc.)

jjrob commented 8 years ago

`plot` is good; they will be used to seeing the data in Arc, so if it looks similar in a plot that is a good sign. It is also important to check the names, types, and values of the fields, to make sure that the Arc → R process did not mangle anything. Typical problems include:

`head` is good for that, but I often use `summary` instead, to look for min and max values and how many NAs there are. I also use `class` and `typeof`, but this should probably not be necessary.
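As a rough illustration of those field checks, here is a sketch on a toy attribute table. The data frame `obs` and its column names are assumptions made up for this example, standing in for whatever was read from Arc:

```r
# Hypothetical attribute table; in practice this would be the data
# loaded from the geodatabase. One NA is included on purpose.
obs <- data.frame(
  Sample.Label = c("1-1", "1-2", "2-1", "2-2"),
  distance     = c(120.5, NA, 310.2, 45.0),
  size         = c(1L, 2L, 1L, 3L),
  stringsAsFactors = FALSE
)

# Field names: did anything get truncated or renamed on the way in?
names(obs)

# Field types: numbers read as text are a common sign of trouble.
sapply(obs, class)
sapply(obs, typeof)

# Min, max and NA counts for each numeric field.
summary(obs)

# Number of missing values per column.
colSums(is.na(obs))
```

If a field that should be numeric shows up as `character`, or the min and max fall outside the range seen in Arc, that points to a problem in the Arc → R step.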

dill commented 8 years ago

I've added some more chat to the first practical about this in 653f2d7.

My advice:

I've ignored the NA issue you raise as we're not dealing with that package. Can include if you think it's a major issue.