mcanouil / eggla

Early Growth Genetics Longitudinal Analysis
https://m.canouil.dev/eggla/

Archives not created `run_eggla_lmm()` #41

Closed by burrowsk 2 years ago

burrowsk commented 2 years ago

Bug description

I can see the various tables and images created by the archives function (line 152 of `run_eggla_lmm()`) appear in the male and female directories that are created (line 157). However, once `run_eggla_lmm()` has completed for each sex, the files are automatically removed. I believed this was due to line 158, `on.exit(unlink(results_directory, recursive = TRUE))`.
Commenting out this line (158) resulted in the files being retained.
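
To illustrate the behaviour described above, here is a minimal sketch of how an `on.exit(unlink(...))` handler removes a directory as soon as the enclosing function returns. The helper `write_then_cleanup()` and the directory name `on-exit-demo` are hypothetical and are not part of eggla.

```r
# Hypothetical helper, not part of eggla: writes a file into a scratch
# directory that is removed when the function exits.
write_then_cleanup <- function(root = tempdir()) {
  scratch <- file.path(root, "on-exit-demo")
  dir.create(scratch, recursive = TRUE, showWarnings = FALSE)
  # Registered now, executed only when the function returns (or errors).
  on.exit(unlink(scratch, recursive = TRUE))
  out <- file.path(scratch, "result.txt")
  writeLines("some output", con = out)
  # The file exists while the function body is still running ...
  list(scratch = scratch, existed_inside = file.exists(out))
}

res <- write_then_cleanup()
res$existed_inside
#> [1] TRUE
# ... but the whole directory is gone once the function has returned.
dir.exists(res$scratch)
#> [1] FALSE
```

Anything that must outlive the function therefore has to be written outside the directory being unlinked, which is where the quoted code places the `.rds` and `.zip` files (directly under `working_directory`).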

I'm working on RStudio version 1.4.1717 on Windows 10.

related code block:

 archives <- sapply(
    X = c(0, 1),
    FUN = function(isex) {
      sex_literal <- c("0" = "male", "1" = "female")[as.character(isex)]
      results_directory <- file.path(working_directory, sex_literal)
      dir.create(results_directory, recursive = TRUE)
      on.exit(unlink(results_directory, recursive = TRUE))
      results <- egg_model(
        formula = base_model,
        data = dt_clean[egg_sex %in% isex],
        id_var = "egg_id",
        random_complexity = random_complexity,
        use_car1 = use_car1,
        knots = knots,
        quiet = quiet
      )

      saveRDS(
        object = results,
        file = file.path(
          working_directory,
          sprintf("%s-%s-model-object.rds", Sys.Date(), sex_literal)
        )
      )

      writeLines(
        text = deparse1(results$call),
        con = file.path(results_directory, "model-call.txt")
      )

      data.table::fwrite(
        x = broom.mixed::tidy(results),
        file = file.path(results_directory, "model-coefficients.csv")
      )

      grDevices::png(
        filename = file.path(results_directory, "model-residuals.png"),
        width = 4 * 2.5,
        height = 3 * 2.5,
        units = "in",
        res = 120
      )
      print(
        plot_residuals(
          x = x_variable,
          y = y_variable,
          fit = results
        ) +
          patchwork::plot_annotation(
            title = sprintf(
              "Cubic Splines (Random Linear Splines) - BMI - %s",
              c("0" = "Male", "1" = "Female")[as.character(isex)]
            ),
            tag_levels = "A"
          )
      )
      invisible(grDevices::dev.off())

      data.table::fwrite(
        x = egg_slopes(
          fit = results,
          period = period,
          knots = knots
        ),
        file = file.path(results_directory, "derived-slopes.csv")
      )

      data.table::fwrite(
        x = egg_aucs(
          fit = results,
          period = period,
          knots = knots
        ),
        file = file.path(results_directory, "derived-aucs.csv")
      )

      data.table::fwrite(
        x = egg_outliers(
          fit = results,
          period = period,
          knots = knots
        ),
        file = file.path(results_directory, "derived-outliers.csv")
      )

      eggc <- egg_correlations(
        fit = results,
        period = period,
        knots = knots
      )

      data.table::fwrite(
        x = eggc[["AUC"]],
        file = file.path(results_directory, "derived-aucs-correlations.csv")
      )
      data.table::fwrite(
        x = eggc[["SLOPE"]],
        file = file.path(results_directory, "derived-slopes-correlations.csv")
      )

      owd <- getwd()
      on.exit(setwd(owd), add = TRUE)
      setwd(results_directory)
      archive_filename <- file.path(
        working_directory,
        sprintf("%s.zip", sex_literal)
      )
      utils::zip(
        zipfile = archive_filename,
        files = list.files()
      )
      archive_filename
    }
  )

  if (!quiet) {
    message("Results available at:")
    message(paste(sprintf("+ '%s'", archives), collapse = "\n"))
  }
  archives
}

eggla version output

‘0.11.2.9000’

burrowsk commented 2 years ago

function call:

res <- run_eggla_lmm(
  data = fread(phenotypes_nopub.csv),
  id_variable = "ID",
  age_days_variable = "days",
  age_years_variable = "age",
  weight_kilograms_variable = "weight",
  height_centimetres_variable = "height",
  sex_variable = "sex",
  covariates = "sourceb",
  male_coded_zero = FALSE,
  random_complexity = 2,
  use_car1 = FALSE,
  parallel = TRUE,
  parallel_n_chunks = 5,
  working_directory = output_directory
)

mcanouil commented 2 years ago

Are the two archives still in the folder output_directory?

burrowsk commented 2 years ago

Yes the directories remain, they are just empty.

mcanouil commented 2 years ago

I meant the zip files. The zip archives are created at the root of output_directory (in your case) from two sub-directories, "male" and "female". Only the zip archives are kept; the two directories are deleted. I can add an option to keep the directories with the outputs in addition to the archives (which contain the same thing).

burrowsk commented 2 years ago

Sorry! No, they are not. They don't appear to be created at all.

mcanouil commented 2 years ago

So, you don't have the zip archives? If so, that is the real issue here.

burrowsk commented 2 years ago

That's right, I don't have the zip archives.



mcanouil commented 2 years ago

@burrowsk I changed the way the archives were made. Could you check if:

  1. You also have the issue with the example (code below).

    # pak::pkg_install("mcanouil/eggla@main")
    remotes::install_github("mcanouil/eggla@main")
    
    library(eggla)
    library(data.table)
    data("bmigrowth")
    fwrite(
     x = bmigrowth,
     file = file.path(tempdir(), "bmigrowth.csv")
    )
    res <- run_eggla_lmm(
     data = fread(file.path(tempdir(), "bmigrowth.csv")),
     id_variable = "ID",
     age_days_variable = NULL,
     age_years_variable = "age",
     weight_kilograms_variable = "weight",
     height_centimetres_variable = "height",
     sex_variable = "sex",
     covariates = NULL,
     random_complexity = 1,
     working_directory = tempdir()
    )
  2. You no longer have the issue with the code below (restart R first).

    # pak::pkg_install("mcanouil/eggla@mcanouil/issue41")
    remotes::install_github("mcanouil/eggla@mcanouil/issue41")
    
    library(eggla)
    library(data.table)
    data("bmigrowth")
    fwrite(
     x = bmigrowth,
     file = file.path(tempdir(), "bmigrowth.csv")
    )
    res <- run_eggla_lmm(
     data = fread(file.path(tempdir(), "bmigrowth.csv")),
     id_variable = "ID",
     age_days_variable = NULL,
     age_years_variable = "age",
     weight_kilograms_variable = "weight",
     height_centimetres_variable = "height",
     sex_variable = "sex",
     covariates = NULL,
     random_complexity = 1,
     working_directory = tempdir()
    )

burrowsk commented 2 years ago

With action 1:

With action 2:

mcanouil commented 2 years ago

For safety/time, my changes included:

Can you try a simpler example to identify why the archive is not created? I suspect it won't work for you for some reason.

dir.create(file.path(tempdir(), "test_archive"))
cat("text", file = file.path(tempdir(), "test_archive", "test.txt"))
utils::zip(
  zipfile = file.path(tempdir(), "test.zip"),
  files = list.files(file.path(tempdir(), "test_archive"), full.names = TRUE),
  flags = "-r9Xj"
)
#> updating: test.txt (stored 0%)
list.files(tempdir(), "\\.zip$")
#> [1] "test.zip"

burrowsk commented 2 years ago

You are right, it did not work! However, from looking online, it appears this may be due to Rtools. I put Rtools on my PATH as per https://cran.r-project.org/bin/windows/Rtools/rtools40.html and now it works. I have just re-run both actions from earlier and the archives are now saved to my Rtmp directory. I've unzipped them and the relevant files are present.

Thanks very much for your help, Windows OS can be so awkward!
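
For anyone hitting the same problem, a quick way to check whether R can find a zip executable is sketched below. It relies on the fact that `utils::zip()` uses the command named in the `R_ZIPCMD` environment variable, or `"zip"` if that variable is unset. The variable names `zip_command` and `zip_available` are only illustrative.

```r
# Command that utils::zip() calls by default: the R_ZIPCMD environment
# variable if it is set, otherwise "zip".
zip_command <- Sys.getenv("R_ZIPCMD", unset = "zip")

# Sys.which() returns the full path of the executable, or "" if it cannot
# be found. An empty R_ZIPCMD is treated as "not available".
zip_available <- nzchar(zip_command) && nzchar(Sys.which(zip_command)[[1]])

if (zip_available) {
  message("zip executable found: ", Sys.which(zip_command)[[1]])
} else {
  message("No zip executable found; utils::zip() is likely to fail.")
}
```

On Windows, a negative result here usually means the Rtools directory is not on the PATH, which matches the fix described above.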

mcanouil commented 2 years ago

That is what I suspected. To avoid any issues, I made some additional changes (5dc0af60942b4fd6e95483c832659d073de54463) which will allow eggla to work on Windows even without Rtools. Users will have to archive the results themselves, without the model object.
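
As a rough sketch of the general idea (not the code from the commit above, which is not shown in this thread), a function can try to build the archive and fall back to returning the results directory when no zip executable is available. The helper `archive_if_possible()` and the `archive-demo` paths are hypothetical.

```r
# Hypothetical helper (not eggla's API): archive a results directory when a
# zip executable is available, otherwise leave the directory in place.
archive_if_possible <- function(results_directory, archive_filename) {
  zip_command <- Sys.getenv("R_ZIPCMD", unset = "zip")
  has_zip <- nzchar(zip_command) && nzchar(Sys.which(zip_command)[[1]])
  if (!has_zip) {
    message("No zip executable found; results kept in '", results_directory, "'.")
    return(results_directory)
  }
  # utils::zip() invisibly returns the exit status of the external command.
  status <- utils::zip(
    zipfile = archive_filename,
    files = list.files(results_directory, full.names = TRUE),
    flags = "-r9Xj"
  )
  if (status != 0L || !file.exists(archive_filename)) {
    return(results_directory)
  }
  archive_filename
}

# Usage: returns the path of the zip archive, or of the directory if
# archiving was not possible.
demo_dir <- file.path(tempdir(), "archive-demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines("text", file.path(demo_dir, "test.txt"))
out <- archive_if_possible(demo_dir, file.path(tempdir(), "archive-demo.zip"))
```

Returning a usable path in both cases means the caller never ends up with neither an archive nor the underlying files, which is the situation reported at the start of this issue.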

burrowsk commented 2 years ago

Yes, I think that is the best approach. It will be easier for other cohort analysts to archive the results themselves than to mess around with Rtools.

mcanouil commented 2 years ago

I am merging #42