NewGraphEnvironment / fish_passage_skeena_2023_reporting

https://newgraphenvironment.github.io/fish_passage_skeena_2023_reporting
Creative Commons Zero v1.0 Universal

photo discrepancies and errors #10

Open NewGraphEnvironment opened 10 months ago

NewGraphEnvironment commented 10 months ago

Here is what the dupes look like after the first round of duplicates is removed. We can see this by running fpr::fpr_photo_remove_dupes once, then modifying the function to remove the calls to distinct and file.remove, and running the modified version a second time.

image

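library(magrittr)  # provides the %>% pipe used below

# Test copy of fpr::fpr_photo_remove_dupes with the distinct() and file.remove()
# steps commented out, so it only lists photos that still share metadata.
test_bfpr_photo_remove_dupes <- function(dir_target = NULL,
                                   col_time = date_time_original,
                                   col_photo_name = source_file,
                                   col_model = model,
                                   col_iso = iso,
                                   remove_renamed = FALSE,
                                   ...){
  # Read EXIF data and keep only photos that share model, ISO and timestamp
  # with at least one other photo; l is the length of the file name
  dat1 <- exifr::read_exif(path = dir_target, recursive = TRUE) %>%
    janitor::clean_names() %>%
    dplyr::group_by({{ col_model }}, {{ col_iso }}, {{ col_time }}) %>%
    dplyr::filter(dplyr::n() > 1) %>%
    dplyr::mutate(l = stringr::str_length({{ col_photo_name }}))

  # Sort so the longest file name (assumed to be the renamed copy) or the
  # shortest file name comes first within each set of duplicates
  if (remove_renamed) {
    dat2 <- dat1 %>%
      dplyr::arrange({{ col_time }}, {{ col_model }}, {{ col_iso }}, dplyr::desc(l))
  } else {
    dat2 <- dat1 %>%
      dplyr::arrange({{ col_time }}, {{ col_model }}, {{ col_iso }}, l)
  }

  # dat3 <- dat2 %>%
  #   dplyr::distinct( {{ col_time }}, .keep_all = T)

  # dat3 %>%
  #   dplyr::pull( {{ col_photo_name }}) %>%
  #   purrr::map(file.remove)

  # Return the sorted duplicates for inspection; nothing is deleted
  dat2
}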
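
To illustrate why a second run still turns up dupes, here is a minimal sketch on made-up metadata (no EXIF reading, no file deletion). The file names, the camera model and the helper `find_dupes` are hypothetical and only mimic the group, sort and distinct steps above. Because distinct keeps a single row per group, one pass flags only one file from a triplicate, so two matching photos remain.

```r
library(dplyr)

# Made-up metadata: one pair and one triplicate of photos sharing model, ISO and timestamp
photos <- tibble(
  source_file = c("a/IMG_001.JPG", "a/IMG_001_renamed.JPG", "a/IMG_002.JPG",
                  "b/IMG_003.JPG", "b/IMG_003_copy.JPG", "b/IMG_003_copy2.JPG"),
  model = "TG-6",
  iso = c(100, 100, 200, 400, 400, 400),
  date_time_original = c("2023:09:01 10:00:00", "2023:09:01 10:00:00",
                         "2023:09:01 10:05:00", "2023:09:02 09:00:00",
                         "2023:09:02 09:00:00", "2023:09:02 09:00:00")
)

# Hypothetical helper: return one file name per duplicate group (shortest name first)
find_dupes <- function(dat) {
  dat %>%
    group_by(model, iso, date_time_original) %>%
    filter(n() > 1) %>%
    mutate(l = nchar(source_file)) %>%
    ungroup() %>%
    arrange(date_time_original, model, iso, l) %>%
    distinct(model, iso, date_time_original, .keep_all = TRUE) %>%
    pull(source_file)
}

# First pass flags one file from the pair and one from the triplicate
to_remove_1 <- find_dupes(photos)

# Simulate the removal by dropping rows instead of deleting files
remaining <- filter(photos, !source_file %in% to_remove_1)

# Second pass: the triplicate still has two members, so one more file is flagged
to_remove_2 <- find_dupes(remaining)

print(to_remove_1)
print(to_remove_2)
```

Dropping rows instead of calling file.remove keeps the sketch safe to rerun.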
NewGraphEnvironment commented 10 months ago

Improvements have now been made to fpr::fpr_photo_remove_dupes.

To see which photos have duplicates and triplicates, and the difference before and after running the actual removals, we can load a .RData file and inspect all the objects it brings in. Restarting R (control + command + 0 on Mac) beforehand can be handy so the workspace starts clean. We should be careful not to rerun fpr::fpr_photo_remove_dupes with dry_run = FALSE, since that performs the actual removals.

image
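
One way to inspect what a .RData file brings in without cluttering the global environment is to load it into its own environment. This is only a sketch: the object names `photos_before` and `photos_after` and the temporary file are made up so the example is self-contained, and the real file will contain different objects.

```r
# Write a throwaway .RData file with two made-up objects
tmp <- tempfile(fileext = ".RData")
photos_before <- data.frame(source_file = c("a.JPG", "a_copy.JPG"))
photos_after <- data.frame(source_file = "a_copy.JPG")
save(photos_before, photos_after, file = tmp)

# Load into a separate environment; load() returns the names of the restored objects
env <- new.env()
loaded <- load(tmp, envir = env)
print(loaded)

# Inspect each object, then compare row counts before and after the removals
str(env$photos_before)
str(env$photos_after)
n_removed <- nrow(env$photos_before) - nrow(env$photos_after)
print(n_removed)
```

Loading into `env` rather than the global environment means nothing already in the session is overwritten by objects of the same name.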