Validating detections from batch, not sure how. Not an issue.

annamac80 commented 5 years ago

Hi guys

I have used a slightly modified version of your batch processing script (see below), but I can't work out how to view and manually validate the detections (viewing and listening). It outputs a detections.csv, but is there a way to get this back into R so that I can do something like showPeaks() and possibly, eventually, bindEvents()?

Sorry to post here, but I have looked and looked and can't seem to find the answer :)

Anna

setwd(choose.dir())  
surveys <- list.files(pattern='\\.wav$')
# Select the template file you would like to use
temps <- readBinTemplates(dir="Z:/MonitoR/templates/Swifty") 

# A loop to 'batch' detect these two templates in all surveys
# Make a directory to store plots
dir.create('plots')
# Make a directory to store detections
dir.create('detections')
# Nice to have progress feedback, but optional.
begin.t<-Sys.time()
for(i in surveys) {
    # Log survey detection start time
    s.start <- Sys.time()    

    # Perform the survey matching
    scores <- binMatch(
        survey=i,
        templates=temps,
        time.source="fileinfo"
    )

    # Convert scores to peaks
    pks <- findPeaks(score.obj=scores)

    # Extract the detections
    detects <- getDetections(pks)

    # Add the name of the current survey (i), not always the first one
    detects <- cbind(survey=i, detects)

    # Remove the file extension from the name
    no.ext <- gsub("\\.wav$", "", i)

    # Save detections as a csv
    write.csv(detects, paste0('detections/', no.ext, '.csv'), row.names=FALSE)

    # Plot a visual record to the local disk (optional)  get rid of this for this one
    #png(filename=paste0("plots/", no.ext, "%03d.png"), width=1000, height=700, pointsize=10)
    #plot(pks, hit.marker='points')
    #dev.off()

    # Prepare a survey status report and send to the console
    survey.t<-format(difftime(Sys.time(),s.start,units='m'))
    cat("Done with",i,survey.t,'elapsed,',which(surveys==i),'of',length(surveys),'\n')
}
# Report the total time to do matching in all surveys
format(difftime(Sys.time(),begin.t,units='m'))

# END

muddynat commented 5 years ago

Hi Anna, I'm not sure if this is exactly what you're looking for, but hopefully you can adapt it to your needs. I wrote this function over a year ago to turn detections into an annotation file to be imported back in and visualized.

toAnnot() is used to turn template detections into an annotation .csv file so the detections can be visually observed against the survey spectrogram, primarily as a way to confirm/reject the detections.

Arguments:

detections - Required. A class data.frame object of detections, the output of getDetections().
file - Required. A character string defining the output file name (and its path). Needs to be in quotes and end in ‘.csv’.
trainingAnno - Optional. If specified, a data frame object from an imported .csv annotation file. If included, this will produce a more accurate estimate of the call time and frequency ranges.

toAnnot <- function(detections, file, trainingAnno = NULL) {
    # function to write detections from survey object to an annotation file
    # that can be pulled in to visualise the detections on the survey file
    if(nrow(detections) == 0) {
        ndf <- data.frame(start.time = '', end.time = '', min.frq = '', max.frq = '', name = '')
        write.csv(ndf, file, row.names = F)
    }
    if(nrow(detections) != 0) {
        annot <- data.frame()
        detections <- as.data.frame(detections)
        annot[nrow(detections),] <- NA
        annot$start.time <- detections$time
        annot$end.time <- annot$start.time + ifelse(is.null(trainingAnno), 0.5, mean(trainingAnno[,2]-trainingAnno[,1]))
        annot$min.frq <- ifelse(is.null(trainingAnno), 0.5, mean(trainingAnno[,3]))
        annot$max.frq <- ifelse(is.null(trainingAnno), 5, mean(trainingAnno[,4]))
        annot$name <- detections$template
        write.csv(annot, file, row.names = F)
        return(annot)
    }
}

sashahafner commented 5 years ago

Our batch functions are pretty simple, and I can see why they would not be sufficient. As we wrote in the help file:

 These functions are simple but do not provide flexibility in how
 results are handled. Manually writing a ‘for’ loop is a more
 flexible solution.

And I'm glad you are giving that a try :)

Anyway, to answer your question, you could save your detailed output (e.g., from findPeaks) in a list, e.g., before the loop:

pklst <- list()

and then, in the loop:

pklst[[i]] <- findPeaks(score.obj=scores)

Then you would have all the results in pklst, and can work with any list element with e.g., showPeaks(), even in a loop if you like.
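
A minimal sketch of that pattern, runnable without any audio files. The survey names and the placeholder value stored in each slot are made up for illustration; in the real loop each element would be the findPeaks() result.

```r
# Survey file names; in the real script these come from list.files()
surveys <- c("a.wav", "b.wav")

# Create the list before the loop, with one named slot per survey
pklst <- vector("list", length(surveys))
names(pklst) <- surveys

for (i in surveys) {
    # In the real loop: pklst[[i]] <- findPeaks(score.obj = scores)
    pklst[[i]] <- list(survey = i)   # placeholder value, for illustration only
}

# Later, pull out any one result by file name
one <- pklst[["b.wav"]]
```

Because i is the file name, each element can be retrieved by name afterwards, which makes it easy to go back to one particular survey.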

jonkatz2 commented 5 years ago

Another option is to save the output as RDS rather than csv. Then you can readRDS them back in later and do any manual verification.

setwd(choose.dir())  
surveys <- list.files(pattern='\\.wav$')
# Select the template file you would like to use
temps <- readBinTemplates(dir="Z:/MonitoR/templates/Swifty") 

# A loop to 'batch' detect these two templates in all surveys
# Make a directory to store plots
dir.create('plots')
# Make a directory to store detections
dir.create('detections')
# Nice to have progress feedback, but optional.
begin.t<-Sys.time()
for(i in surveys) {
    # Log survey detection start time
    s.start <- Sys.time()    

    # Perform the survey matching
    scores <- binMatch(
        survey=i,
        templates=temps,
        time.source="fileinfo"
    )

    # Convert scores to peaks
    pks <- findPeaks(score.obj=scores)

    #### SAVE OUTPUT FOR LATER ####
    # Swap the file extension for .RDS
    outfile <- gsub("\\.wav", ".RDS", i)

    # Save detections as RDS
    saveRDS(pks, outfile)
    #### 

    # Plot a visual record to the local disk (optional)  get rid of this for this one
    #png(filename=paste0("plots/", no.ext, "%03d.png"), width=1000, height=700, pointsize=10)
    #plot(pks, hit.marker='points')
    #dev.off()

    # Prepare a survey status report and send to the console
    survey.t<-format(difftime(Sys.time(),s.start,units='m'))
    cat("Done with",i,survey.t,'elapsed,',which(surveys==i),'of',length(surveys),'\n')
}
# Report the total time to do matching in all surveys
format(difftime(Sys.time(),begin.t,units='m'))

# END
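
To get the saved objects back later, something like the following should work. This is only a sketch: readPeaks() is a hypothetical helper, not a monitoR function, and it assumes the .RDS files sit in a single directory.

```r
# Hypothetical helper (not part of monitoR): load every .RDS file in `dir`
# into a list named by file (extension removed).
readPeaks <- function(dir = ".") {
    files <- list.files(dir, pattern = "\\.RDS$", full.names = TRUE)
    out <- lapply(files, readRDS)
    names(out) <- sub("\\.RDS$", "", basename(files))
    out
}

# pklst <- readPeaks()
# showPeaks(pklst[[1]])   # see ?showPeaks for the verification options
```

Each list element is then the same object that findPeaks() returned in the loop, so it can be passed to the usual monitoR functions.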

annamac80 commented 5 years ago

Great thanks so much everyone, some really good suggestions. I'll give them a go and let you know what I end up going with.

Anna

MichelleThompson86 commented 4 years ago

Hi, I have been trying to figure out how to run analyses in batches and this loop code has been helpful. Sometimes it works for me and sometimes it doesn't (using the exact same code and files without making any changes - which I find odd). When it errors it says:

Error in data.frame(..., check.names = FALSE) : arguments imply differing number of rows: 1, 0

Any ideas? I am using Anna's code from her original post. Thanks in advance.

sashahafner commented 4 years ago

It is hard to say without knowing which specific line causes the error. You can find that and debug by running the commands in the body of the loop one by one after you get the error (i.e., keep the value of i that gives the error). Or call traceback() right after you see the error. I wonder if it is the cbind() command that causes the problem because there are no detections.
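
Here is a small sketch of what I mean, using a made-up empty data frame as a stand-in for getDetections() output with no detections (the column names are an assumption). addSurvey() is a hypothetical helper, not part of monitoR.

```r
# Stand-in for detections from a survey where nothing was found:
# the columns are there, but there are zero rows (column names assumed).
empty <- data.frame(template = character(0), time = numeric(0), score = numeric(0))

# Reproduce the failure: a length-1 survey name cannot be matched to zero rows
msg <- tryCatch(
    { cbind(survey = "a.wav", empty); "no error" },
    error = function(e) conditionMessage(e)
)
print(msg)

# Hypothetical helper: add the survey name, tolerating zero detections
addSurvey <- function(detects, survey) {
    if (nrow(detects) == 0) {
        # Zero-length column keeps the layout consistent with non-empty results
        return(cbind(survey = character(0), detects))
    }
    cbind(survey = survey, detects)
}
```

In the loop, detects <- addSurvey(detects, i) would then replace the plain cbind() call, so a survey without detections still writes a header-only csv instead of stopping the batch.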

MichelleThompson86 commented 4 years ago

It was the cbind()! Thanks, I just took that part out and later wrote code to combine the results of the individual csv files into one csv post hoc. Works great. Thanks again, Michelle
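
For anyone who needs the same post hoc step, a rough sketch is below. combineDetections() is a hypothetical name, and it assumes one csv per survey in a single directory, with the survey name recoverable from the file name.

```r
# Hypothetical helper (not part of monitoR): stack every csv in `dir` into one
# data frame, tagging each row with the file it came from.
combineDetections <- function(dir = "detections") {
    files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
    parts <- lapply(files, function(f) {
        d <- read.csv(f, stringsAsFactors = FALSE)
        if (nrow(d) == 0) return(NULL)   # header-only file: no detections
        cbind(survey = sub("\\.csv$", "", basename(f)), d)
    })
    # Drop the skipped files, then bind the rest row-wise
    do.call(rbind, Filter(Negate(is.null), parts))
}

# all.detects <- combineDetections("detections")
# write.csv(all.detects, "all_detections.csv", row.names = FALSE)
```

Adding the survey column at this stage avoids the zero-row problem, because files without detections are skipped before cbind() is called.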

jonkatz2 / monitoR
