zhrandell / Seattle_Aquarium_CCR_analytical_resources

This is a public repository to organize information pertaining to the cleaning, analysis, and visualization of ROV telemetry and spatial data, as well as preliminary information related to the % cover analyses (via CoralNet) of image stills derived from ROV video.

Working with Ping data to filter sample units (rows) down to survey transects #6

Closed zhrandell closed 1 year ago

zhrandell commented 1 year ago

Notes for myself that I need to:

zhrandell commented 1 year ago

Okay, @m-h-williams, I've made a little progress here. There are a couple of quirks in how we need to select data based on altitude, so (for now) I took a shortcut and (gasp) directly edited the .csv file to trim it down to our exact transects. This is a bit of a faux pas, but I wanted to keep us moving, so alas.

Once we have individual .csv files for each transect, we can visualize the data, e.g.,


You can see the numerous erroneous ping altitude records. To clean these up, we can run the following:

library(dplyr)
# treat altitudes above 1.5 m as erroneous and set them to NA
dat <- dat %>%
  mutate(smoothed = ifelse(avg_dist > 1.5, NA, avg_dist))

which removes the erroneously large values and replaces them with NA, like this:

  avg_dist avg_conf smoothed
1     1.48   100.00     1.48
2     1.46   100.00     1.46
3     4.97   100.00       NA
4     5.21   100.00       NA
5     1.43   100.00     1.43
6     1.44   100.00     1.44

We can then run na.approx() from library(zoo) to interpolate the missing values, like this:

dat <- dat %>%
  mutate(smoothed = na.approx(smoothed))

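NOTE that na.approx() only interpolates between real values, so it needs one on either side of each NA. If the smoothed column begins or ends with NA, na.approx() drops those end values by default, the result no longer has one value per row, and the mutate() call fails.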
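A minimal sketch of that edge case, in case it helps. The vector x is made up for illustration (it is not from our transects); na.rm and rule are existing na.approx() arguments, with rule passed through to approx(). Whether holding the end values constant is acceptable for our altitude data is an open question, not something we have decided.

```r
library(zoo)

# Hypothetical altitude vector with an NA at each end
x <- c(NA, 1.46, NA, NA, 1.43, NA)

# Default (na.rm = TRUE): leading/trailing NAs are dropped, so the result
# is shorter than the input and would not fit back into the dataframe
short <- na.approx(x)
length(short)

# na.rm = FALSE keeps the original length and leaves the end NAs in place
kept <- na.approx(x, na.rm = FALSE)

# rule = 2 fills the ends with the nearest observed value instead
filled <- na.approx(x, rule = 2)
```

Either of the last two calls returns one value per row, so it can be used inside mutate() without trimming the dataframe first.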

Running na.approx() on the above data produces:

  avg_dist avg_conf smoothed
1     1.48   100.00     1.48
2     1.46   100.00     1.46
3     4.97   100.00     1.45
4     5.21   100.00     1.44
5     1.43   100.00     1.43
6     1.44   100.00     1.44

and when applied to the whole dataframe, produces the following: [figure: trimmed_with_errors_interpolated]
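For anyone who wants to rerun the two steps on each transect file, here is a rough sketch that wraps them in one function and checks it against the six example rows above. clean_altitude() and its max_alt argument are names I made up for illustration, not something already in the repo; na.rm = FALSE is added so the column keeps one value per row even if a transect starts or ends on a bad ping.

```r
library(dplyr)
library(zoo)

# Hypothetical helper: set altitudes above max_alt to NA, then linearly
# interpolate across the gaps. End NAs (if any) are left as NA.
clean_altitude <- function(dat, max_alt = 1.5) {
  dat %>%
    mutate(smoothed = ifelse(avg_dist > max_alt, NA_real_, avg_dist),
           smoothed = na.approx(smoothed, na.rm = FALSE))
}

# The six example rows shown above
dat <- data.frame(avg_dist = c(1.48, 1.46, 4.97, 5.21, 1.43, 1.44),
                  avg_conf = 100)

out <- clean_altitude(dat)
out
```

Rows 3 and 4 come out as 1.45 and 1.44, i.e. evenly spaced between the neighbouring good values 1.46 and 1.43.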

zhrandell commented 1 year ago

Closing this for now, though I'll note that we never came up with a true "rules-based" method of filtering altitude values based on, e.g., rates of change along a rolling window. Rather, we set an arbitrary 1.5 m threshold above which values are tossed out. This is somewhat defensible given the vehicle does not exceed an altitude of 1.5 m (ideally not > 1.2 m) during surveys, though a more elegant solution would be preferred.
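For future reference, one possible shape for a rolling-window rule is sketched below. To be clear, this is not something we implemented or tested on real transects: it swaps the "rate of change" idea for a simpler running-median check using base R's runmed(), and flag_spikes(), the window width k, the tolerance tol, and the alt vector are all made-up placeholders. It also assumes the input has no NAs.

```r
# Hypothetical rule: flag a ping when it differs from the running median
# of its k-point window by more than tol metres. k must be odd.
# endrule = "keep" leaves the first and last k %/% 2 values unsmoothed,
# so pings at the very ends of a transect are never flagged.
flag_spikes <- function(alt, k = 5, tol = 0.5) {
  med <- as.numeric(stats::runmed(alt, k = k, endrule = "keep"))
  abs(alt - med) > tol
}

# Made-up altitude series with two consecutive bad pings
alt <- c(1.48, 1.46, 1.45, 4.97, 5.21, 1.43, 1.44, 1.42, 1.41)
bad <- flag_spikes(alt)

# Flagged values could then be set to NA and passed on to na.approx()
smoothed <- ifelse(bad, NA_real_, alt)
```

On this series only the two large values (positions 4 and 5) are flagged. A run of bad pings longer than half the window would pull the median up and go undetected, so k would need tuning against real data.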