camrinbraun / tags2etuff

An R package for converting the hugely variable formats of animal tag data to a flat file format called eTUFF

idvar must uniquely identify records #39

Closed marosteg closed 8 months ago

marosteg commented 8 months ago

I've been generating eTUFFs for a set of superficially identical deployment files but one is yielding an error. MWE:

devtools::load_all('~/Dropbox/GitHub/tags2etuff')
library(data.table)
library(lubridate)

# Read in metadata
meta <- read.csv('add_blueshark_meta.csv', header = TRUE)
# Format dates
meta$time_coverage_start <- lubridate::parse_date_time(meta$time_coverage_start, orders = 'Ymd HMS', tz = 'UTC')
meta$time_coverage_end <- lubridate::parse_date_time(meta$time_coverage_end, orders = 'Ymd HMS', tz = 'UTC')

dir <- '~/Google Drive/My Drive/Shark_data_MCA/' # check your wd (you have access already)
fish <- list.files(dir)

z <- 4 # the problem deployment

fishID <- fish[z]
data.dir <- paste0(dir, fishID)

# Generate eTUFF with track
etuff <- tag_to_etuff(dir = data.dir, meta = meta[which(meta$ptt == fishID), ], gpe3 = TRUE)

...which leads to the following output:

[1] "Getting obsTypes..."
[1] "Reading Wildlife Computers popup or archival tag"
[1] "Getting PDT data..."
Error in reshapeLong(data, idvar = idvar, timevar = timevar, varying = varying,  : 
  'idvar' must uniquely identify records
In addition: Warning message:
In extract.pdt(data) :
  PTT column is empty. It is automatically being filled with DeploymentID.
Called from: reshapeLong(data, idvar = idvar, timevar = timevar, varying = varying, 
    v.names = v.names, drop = drop, times = times, ids = ids, 
    new.row.names = new.row.names)
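
For context, this error message comes from base R's reshape() and is raised when the values in the idvar column are not unique. The sketch below reproduces the same class of failure on a made-up table. The data frame, column names, and values are hypothetical and are not taken from extract.pdt() or the deployment files.

```r
# Hypothetical wide table with a duplicated id (made-up values).
wide <- data.frame(
  id     = c(1, 1),   # duplicated on purpose
  depth1 = c(10, 20),
  depth2 = c(30, 40)
)

# reshape() to long format fails because 'id' does not uniquely identify rows.
result <- tryCatch(
  reshape(wide, direction = "long",
          idvar = "id", timevar = "bin",
          varying = c("depth1", "depth2"), v.names = "depth"),
  error = function(e) e
)
print(conditionMessage(result))

# With unique ids the same call succeeds and returns one row per id and bin.
wide$id <- c(1, 2)
long <- reshape(wide, direction = "long",
                idvar = "id", timevar = "bin",
                varying = c("depth1", "depth2"), v.names = "depth")
print(nrow(long))
```

If a file is read incorrectly so that the identifying column ends up empty or repeated, a reshape step like this would be expected to fail in the same way.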

add_blueshark_meta.csv

camrinbraun commented 8 months ago

This is a data issue rather than an eTUFF issue. The "DeployID" column in most of the .csv files contains "Mako #5 2011", which is not programmatically friendly. Tracing the error from tag_to_etuff() to read.wc() shows that the hash causes the file to be read improperly: read.wc() reads the .csv file, but the result contains essentially no data. Changing the DeployID values to anything programmatically friendly (for example, copying the PTT column, since DeployID is not used) solves the issue, and eTUFF then runs as expected. I've uploaded the modified .csv files to the Drive.
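
As a rough illustration of how a hash can break a read in R, the sketch below uses base read.table(), which treats "#" as a comment character by default. Whether read.wc() reads files this way is an assumption here, not something confirmed in this thread. The column names (DeployID, Ptt, Depth) and values are made up for the example.

```r
# Hypothetical contents of a tag .csv; column names and values are made up.
csv_text <- c(
  "DeployID,Ptt,Depth",
  "Mako #5 2011,12345,10",
  "Mako #5 2011,12345,20"
)

# read.table() defaults to comment.char = "#", so each data row is cut at the hash
# and the remaining columns are padded with NA (fill = TRUE).
truncated <- read.table(text = csv_text, sep = ",", header = TRUE, fill = TRUE)

# Disabling comment handling keeps the full rows (read.csv() does this by default).
intact <- read.table(text = csv_text, sep = ",", header = TRUE, comment.char = "")

# Workaround described above: overwrite DeployID with the PTT value.
intact$DeployID <- as.character(intact$Ptt)

print(truncated)
print(intact)
```

Either approach (reading with comment handling disabled, or removing the hash from the data) avoids the truncation in this toy case. The thread's fix was the latter, applied to the .csv files themselves.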