leppott / MBSStools

Suite of tools for data manipulation and calculations for Maryland DNR MBSS program.
https://leppott.github.io/MBSStools/
GNU General Public License v3.0

Exclude column not working #31

Closed leppott closed 3 years ago

leppott commented 4 years ago

Describe the bug The EXCLUDE column does not seem to be working. I have tried both the Y/N and TRUE/FALSE designations, but all taxa are still counted in the metrics. This was not a problem when I ran these data in the fall. The code runs regardless of how this column is formatted; invalid input in this column should trigger a fatal error.

leppott commented 4 years ago

Ensure that No and Yes are handled by the QC check in metric.values (line 203).

leppott commented 4 years ago

Related to Issue #30

leppott commented 4 years ago

One of the special QC checks was incorrectly converting values to FALSE. Fixed in metric.scores.

Example data "CHGX-432-S-1111" has 29 entries, with 2 marked as EXCLUDE = TRUE.

ntaxa should be 27, not 29. Fixed in update v1.1.0.9021.
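
A minimal sketch of the expected count, using made-up data rather than the real "CHGX-432-S-1111" sample. It assumes a taxon is dropped from the count when EXCLUDE is TRUE; the data frame and column values are hypothetical and only illustrate the 27 versus 29 difference.

```r
# Hypothetical sample: 29 taxa rows, 2 of them flagged for exclusion.
taxa <- data.frame(
  TAXON   = paste0("Taxon_", seq_len(29)),
  EXCLUDE = c(rep(FALSE, 27), TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Count only the rows that are not excluded.
ntaxa <- sum(!taxa$EXCLUDE)

# If every EXCLUDE value is wrongly coerced to FALSE, all rows are counted.
taxa_bug <- taxa
taxa_bug$EXCLUDE <- FALSE
ntaxa_bug <- sum(!taxa_bug$EXCLUDE)

print(c(correct = ntaxa, bug = ntaxa_bug))
```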

leppott commented 4 years ago

v1.1.0.9021

https://github.com/leppott/MBSStools/commit/3a871ac554fd38777d4ae560d17bb33a99c003b0

leppott commented 3 years ago

Code for checking different scenarios. Should work in all cases now.

Code from metric.values.R, line 239

TRUE = Y, YES

FALSE = N, NO, NA, "NA", "", null


```
  # fix for common non-standard entries.
  qc_col   <- "EXCLUDE"
  myDF[, qc_col] <- toupper(as.character(myDF[,qc_col]))
  # Use grepl to check otherwise fails if do a normal subset
  # myDF[myDF[, qc_col] == "Y", qc_col] <- TRUE # This fails if none present
  myDF[grepl("Y", myDF[, qc_col]), qc_col] <- TRUE
  myDF[grepl("YES", myDF[, qc_col]), qc_col] <- TRUE
  myDF[grepl("N", myDF[, qc_col]), qc_col] <- FALSE
  myDF[grepl("NO", myDF[, qc_col]), qc_col] <- FALSE
  myDF[myDF[, qc_col] =="", qc_col] <- FALSE
  myDF[grepl("NA", myDF[, qc_col]), qc_col] <- FALSE
  myDF[is.null(myDF[, qc_col]), qc_col] <- FALSE
  myDF[is.na(myDF[, qc_col]), qc_col] <- FALSE
  # Valid values are: TRUE and FALSE
  qc_col   <- "EXCLUDE"
  qc_val   <- c("TRUE", "FALSE")
  qc_user  <- unique(myDF[, qc_col])
  qc_check <- qc_user %in% qc_val
  qc_invalid <- qc_user[!qc_check]
  if(length(qc_check) != sum(qc_check)){
    myMsg <- paste0("\nBad values in ", qc_col, ".\n Valid: \n  "
                    , paste(qc_val, sep= "", collapse = ", ")
                    , "\n Invalid: \n  "
                    , paste(qc_invalid, sep = "", collapse = ", ")
                    , collapse="")
    stop(myMsg)
  }## IF ~ QC, Strata ~ END
  # move logical after check
  myDF[, qc_col] <- as.logical(myDF[, qc_col])
```

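For comparison, a stricter variant of the same mapping can be sketched with exact matching. Because grepl() matches substrings, a value such as "MAYBE" contains "Y" and would be mapped to TRUE before the QC check sees it. The helper below, `normalize_exclude()`, is hypothetical and not part of MBSStools; it assumes the TRUE and FALSE synonyms listed above, handles NA before any string comparison, and stops on anything else.

```r
# Hypothetical helper (not in MBSStools): normalize an EXCLUDE-style vector.
normalize_exclude <- function(x) {
  x_chr <- toupper(trimws(as.character(x)))
  # Treat missing values as FALSE before any string comparison.
  x_chr[is.na(x_chr)] <- "FALSE"
  true_vals  <- c("TRUE", "Y", "YES")
  false_vals <- c("FALSE", "N", "NO", "NA", "")
  # Exact matching, so "MAYBE" or "UNKNOWN" are not silently accepted.
  is_true  <- x_chr %in% true_vals
  is_false <- x_chr %in% false_vals
  invalid  <- unique(x_chr[!(is_true | is_false)])
  if (length(invalid) > 0) {
    stop("Bad values in EXCLUDE.\n Valid: "
         , paste(c(true_vals, false_vals), collapse = ", ")
         , "\n Invalid: "
         , paste(invalid, collapse = ", "))
  }
  # Logical result: TRUE only for the accepted TRUE synonyms.
  is_true
}

# Usage with mixed-case and missing entries.
print(normalize_exclude(c("y", "No", NA, "", "TRUE")))
```

With a data frame this could be applied as `myDF[, "EXCLUDE"] <- normalize_exclude(myDF[, "EXCLUDE"])`, where `myDF` is the data frame from the code above.
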
leppott commented 3 years ago

Tweaked QC by moving logical to end of section. Added NA, "NA", and "".

v1.1.0.9052 https://github.com/leppott/MBSStools/commit/d604a159693c6b2e6e0aaf4fd19826067c03fe8b
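
A small illustration of why the conversion to logical belongs after the QC check. The values are made up for the example. In base R, as.logical() returns NA for a string it does not recognize, so converting first would lose the original text of the invalid entry.

```r
# Made-up values: two valid entries and one invalid entry.
vals <- c("TRUE", "FALSE", "MAYBE")

# Converting first turns the invalid entry into NA; its original text is lost.
converted_first <- as.logical(vals)
print(converted_first)

# Checking the character values first keeps the offending entry visible.
invalid <- setdiff(unique(vals), c("TRUE", "FALSE"))
print(invalid)
```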