BiologicalRecordsCentre / wrappeR

How to extract metadata for different model structures #37

Closed 03rcooke closed 3 years ago

03rcooke commented 3 years ago

I have currently bodged this code so that it works for the multiple different model structures. The problem is that the metadata from daisy-chained models is stored at a different iteration count for different taxonomic groups, e.g., for bwars models the metadata is in the _2000_1 file, but for bryophytes it is in the _4000_1 file.

This is the code that needs fixing within tempSampPost:

combineSamps <- function(species, minObs) { 
    # NJBI this function refers to several global variables, e.g. tn - not good practice
    #print(species)
    out <- NULL
    raw_occ <- NULL

    if(substr(first.spp, (nchar(first.spp) + 1) - 2, nchar(first.spp)) %in% c("_1", "_2", "_3")) {

      if(first.spp == "Bry_1_12000_1") { # THIS IS BAD CODING - but no easy way round it

        out_meta <- load_rdata(paste0(indata, species, "_4000_1.rdata")) # where metadata is stored for bryophyte JASMIN models 

      } else if(first.spp == "Abrothallus bertianus_10000_1") { # THIS IS BAD CODING - but no easy way round it

        out_meta <- load_rdata(paste0(indata, species, "_5000_1.rdata")) # where metadata is stored for lichen JASMIN models 

      } else {

        out_meta <- load_rdata(paste0(indata, species, "_2000_1.rdata")) # where metadata is stored for JASMIN models 

      }

    } else {

      out_dat <- load_rdata(paste0(indata, species, ".rdata"))
      out_meta <- out_dat

    }

AugustT commented 3 years ago

Try this

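# Example iteration range; named so as not to mask base::min() and base::max()
iter_min <- 1000
iter_max <- 20000
iter_step <- 1000

# Build dummy file names of the form <LETTER>_<iterations>_<chain>
list_of_file_names <- apply(expand.grid(LETTERS,
                                  as.character(seq(iter_min, iter_max, by = iter_step)),
                                  1:3),
                      1,
                      paste,
                      collapse="_")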

list_of_file_names <- paste0(list_of_file_names, '.rdata')

findMinIteration <- function(list_of_file_names){

  if(length(list_of_file_names) < 1) stop('list_of_file_names is empty')
  if(!is.character(list_of_file_names)) stop('list_of_file_names must be a character vector')

  # remove the chain number and file extension:
  # find '_' followed by a single digit and a '.', and remove
  # that and everything that follows
  # (drop '\\..+' from the pattern if the file names have no extension)
  list_of_file_names <- gsub('_[[:digit:]]{1}\\..+$', '', list_of_file_names)

  # Extract the iterations number
  iterations <- regmatches(list_of_file_names, regexpr('[[:digit:]]+$', list_of_file_names))

  # Get minimum
  return(min(as.numeric(iterations)))

}

findMinIteration(list_of_file_names)
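
One way the same idea might slot into combineSamps is sketched below. It assumes that the metadata always sits in the file with the lowest iteration count for a given species and chain, which matches the _2000_1, _4000_1 and _5000_1 cases above but has not been checked for every group. The helper name find_meta_file and its arguments are hypothetical, not part of wrappeR.

```r
# Hypothetical helper: pick the daisy-chained file with the lowest iteration
# count for one species, on the assumption that this is where the metadata is.
find_meta_file <- function(species, file_names, chain = 1) {

  # Files are assumed to be named "<species>_<iterations>_<chain>.rdata"
  prefix <- paste0(species, "_")
  suffix <- paste0("_", chain, ".rdata")
  candidates <- file_names[startsWith(file_names, prefix) &
                             endsWith(file_names, suffix)]

  # Whatever sits between prefix and suffix should be the iteration count
  middle <- substr(candidates,
                   nchar(prefix) + 1,
                   nchar(candidates) - nchar(suffix))
  ok <- grepl("^[[:digit:]]+$", middle)
  if (!any(ok)) stop("no daisy-chained files found for ", species)

  # Return the file name with the smallest iteration count
  iterations <- as.numeric(middle[ok])
  candidates[ok][which.min(iterations)]
}

# Made-up file names for illustration only
example_files <- c("Bry_1_4000_1.rdata", "Bry_1_8000_1.rdata",
                   "Bry_1_12000_1.rdata", "Bry_2_4000_1.rdata")
meta_file <- find_meta_file("Bry_1", example_files)
print(meta_file)
```

Inside combineSamps the file names could come from list.files(indata), and the result could be passed to load_rdata(paste0(indata, meta_file)) in place of the hard-coded first.spp branches.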