Metropolitan-Council / tc.sensors

Package with functions to pull sensor data, sensor IDs, and sensor configuration for MnDOT metro district
https://metropolitan-council.github.io/tc.sensors

pull_sensor function: integrate a function to gapfill data #2

Closed ashleyasmus closed 4 years ago

ashleyasmus commented 4 years ago

It would be great if pull_sensor automatically gap-filled short runs of missing data.

I tried to do this with "frollapply" in some scrappy code, but it is slow and does not work very well:

  library(data.table)

  # `j` (sensor ID), `n`, and `date_range` are assumed to be defined earlier
  loops_ls <- vector("list", n)
  for (i in seq_len(n)) {
    loops_ls[[i]] <- tc.sensors::pull_sensor(j, date_range[[i]])
  }

  loops_df <- data.table::rbindlist(loops_ls)

  # This part is mine - gapfill:
  loops_df[, date := as.IDate(date)]
  setorder(loops_df, date)
  loops_df[, year := year(date)]
  # gapfill - hours -- I don't think this works?
  if (nrow(loops_df[is.na(volume)]) > 0) {
    # 2 * 60 observations = one hour, assuming 30-second data
    # no shift() here: lagging the result would offset the centered window by one row
    loops_df[, volume.rollmedian.hour := frollapply(volume, 2 * 60, median, align = "center", na.rm = TRUE),
             by = year]
    # loops_df[, occupancy.rollmedian.hour := frollapply(occupancy, 2 * 60, median, align = "center", na.rm = TRUE),
    #          by = year]

    loops_df[, volume := ifelse(is.na(volume), volume.rollmedian.hour, volume)]
    # loops_df[, occupancy := ifelse(is.na(occupancy), occupancy.rollmedian.hour, occupancy)]

    loops_df[, volume.rollmedian.hour := NULL]
  }
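
For reference, the rolling-median idea can be tried on a small synthetic table without pulling any sensor data. This is only a sketch under assumptions: the table `toy`, its columns, and the 5-observation window are made up for illustration and are not part of tc.sensors.

```r
library(data.table)

# Synthetic stand-in for one sensor: volumes with a short two-row gap
toy <- data.table(
  slot   = 1:20,
  volume = c(4, 5, 6, 5, 4, NA, NA, 5, 6, 7, 6, 5, 4, 5, 6, 5, 4, 5, 6, 5)
)

# Centered rolling median over 5 observations; na.rm = TRUE is passed on
# to median() so NAs inside a window are ignored
toy[, roll_median := frollapply(volume, 5, median, na.rm = TRUE, align = "center")]

# Keep observed values; use the rolling median only where volume is missing
toy[, volume_filled := fcoalesce(volume, roll_median)]

print(toy[is.na(volume)])
```

A small window like this only bridges short gaps. If every value in a window is missing, the median is NA and the gap stays unfilled.
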
eroten commented 4 years ago

@ashleyasmus Do you think we should integrate functions for aggregating up to hours, days, etc.?

eroten commented 4 years ago

This issue is closed with the introduction of the fill_gaps parameter in pull_sensor(). If you are aggregating the raw data, you will want to be sure to use fill_gaps = TRUE when pulling the raw data. If there are no gaps in the raw data, there won't be gaps when you aggregate up.
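
To illustrate why gaps in the raw data matter when aggregating, here is a minimal sketch on synthetic data. It does not call pull_sensor(). The column names (`hour`, `slot`, `volume`) and the treatment of a gap as a missing 30-second row are assumptions for illustration, and may not match what fill_gaps does internally.

```r
library(data.table)

# Synthetic 30-second data for two hours: 120 observations per hour,
# each with a volume of 1
raw <- CJ(hour = 0:1, slot = 1:120)
raw[, volume := 1]

# Drop 10 rows from hour 1 to mimic a gap in the raw feed
gappy <- raw[!(hour == 1 & slot %in% 51:60)]

# Aggregating the gappy data undercounts hour 1 without any warning
hourly_gappy <- gappy[, .(volume_sum = sum(volume), n_obs = .N), by = hour]

# Restore the missing time slots as explicit NA rows, then aggregate again;
# the number of missing observations is now visible for each hour
complete <- gappy[CJ(hour = 0:1, slot = 1:120), on = .(hour, slot)]
hourly_complete <- complete[, .(
  volume_sum = sum(volume, na.rm = TRUE),
  n_missing  = sum(is.na(volume))
), by = hour]

print(hourly_gappy)
print(hourly_complete)
```

With the time slots completed first, each hour carries an explicit count of missing observations, so a partial hour can be flagged or imputed rather than silently undercounted.
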

59fac22b1eeab7780ee30e414daad9da84e0f08e