bcgov / wqbench

R package to generate, download, and compile data from the EPA ECOTOX database
Apache License 2.0

which duration column to use #8

Closed by aylapear 1 year ago

aylapear commented 1 year ago

Tests table

Results table

Currently using the obs_duration_unit

aylapear commented 1 year ago

Angeline said

aylapear commented 1 year ago

Updated the compile function to make the switch. It now builds a general duration column from the two (study and observed) before joining it to the code description table:

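    # add duration unit info
    # use study duration, but fall back to observed duration when the study
    # duration is missing or the chosen value contains "NC"
    # (the unit is switched before the mean so both tests see the same value)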
    dplyr::mutate(
      duration_mean = dplyr::if_else(
        !is.na(.data$study_duration_mean),
        .data$study_duration_mean,
        .data$obs_duration_mean
      ),
      duration_unit = dplyr::if_else(
        !is.na(.data$study_duration_mean),
        .data$study_duration_unit,
        .data$obs_duration_unit
      ),
      duration_unit = dplyr::if_else(
        !stringr::str_detect(.data$duration_mean, "NC"),
        .data$duration_unit,
        .data$obs_duration_unit
      ),
      duration_mean = dplyr::if_else(
        !stringr::str_detect(.data$duration_mean, "NC"),
        .data$duration_mean,
        .data$obs_duration_mean
      )
    ) |>
    dplyr::left_join(
      db_duration_unit_codes, by = c("duration_unit" = "code")
    ) |>
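The fallback rule can be tried on a small made-up table. This is only a sketch, not the package's code: the `toy` data and the `use_study` helper column are hypothetical, and it folds the two checks (missing, or containing "NC") into one flag. It assumes the duration means are stored as character, as the `str_detect()` call in the snippet suggests.

```r
# Hypothetical toy data: column names follow the snippet above, values are invented.
toy <- dplyr::tibble(
  study_duration_mean = c("96", NA, "NC", NA),
  study_duration_unit = c("h", NA, "d", NA),
  obs_duration_mean   = c("48", "24", "7", NA),
  obs_duration_unit   = c("h", "h", "d", NA)
)

out <- toy |>
  dplyr::mutate(
    # TRUE only when the study duration is present and does not contain "NC";
    # FALSE & NA evaluates to FALSE, so the flag itself is never NA
    use_study = !is.na(study_duration_mean) &
      !stringr::str_detect(study_duration_mean, "NC"),
    duration_mean = dplyr::if_else(use_study, study_duration_mean, obs_duration_mean),
    duration_unit = dplyr::if_else(use_study, study_duration_unit, obs_duration_unit)
  ) |>
  dplyr::select(-use_study)

print(out[, c("duration_mean", "duration_unit")])
```

With this toy input, row 1 keeps the study values ("96", "h"), rows 2 and 3 fall back to the observed values ("24", "h" and "7", "d"), and row 4 stays `NA` because both sources are missing. One difference from the snippet above: when both means are missing, this variant still copies `obs_duration_unit`, whereas the sequential `if_else()` calls return `NA` for the unit because `str_detect()` on `NA` is `NA`.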