Closed gvegayon closed 3 months ago
Hey @UofUEpiBio/phs7045-2023, if your final is about a simulation study to analyze the properties of an estimator, section 5 of this paper may be useful for you. This is a large simulation study I ran using slurmR
.
We need to find coefficients in a logistic regression to match a desired proportion. Here is how we do it:
# Target proportion
target_prop <- .05
set.seed(3123)
n <- 1e4
X <- cbind(rnorm(n))
# Objective function (loss function)
fun <- function(x) {
Y <- rbinom(100000, 1, prob = plogis(x[1] + X*x[2]))
(mean(Y) - target_prop)^2
}
# Finding the closest value
res <- optim(par = c(0, 0), fn = fun)
res
#> $par
#> [1] -4.472115 1.985507
#>
#> $value
#> [1] 9e-10
#>
#> $counts
#> function gradient
#> 71 NA
#>
#> $convergence
#> [1] 0
#>
#> $message
#> NULL
# Are we getting there?
message(
"Target is : ", target_prop, "\n",
"Proposed value: ", paste(res$par, collapse = ", "), "\n",
"We got : ", mean(rbinom(100000, 1, prob = plogis(res$par[1] + res$par[2] * X)))
)
#> Target is : 0.05
#> Proposed value: -4.47211456298828, 1.98550653457642
#> We got : 0.05035
Created on 2023-04-27 with reprex v2.0.2
data("USArrests")
mymodel <- function(...) {
res <- glm(...)
structure(
res,
class = c("mymodel", class(res))
)
}
ans <- mymodel(Murder ~ Assault + Rape, data = USArrests)
# There's no print method for "mymodel", but there is one for the next class
# which is "glm"
print(ans)
#>
#> Call: glm(formula = ..1, data = ..2)
#>
#> Coefficients:
#> (Intercept) Assault Rape
#> 0.41886 0.04003 0.02514
#>
#> Degrees of Freedom: 49 Total (i.e. Null); 47 Residual
#> Null Deviance: 929.6
#> Residual Deviance: 330.3 AIC: 244.3
summary.mymodel <- function(object, ...) {
ans <- stats::summary.glm(object)$coefficients
structure(
ans,
message = "mymodel summary:\n",
class = c("mymodel_summary", class(ans))
)
}
print.mymodel_summary <- function(x, ...) {
cat(attr(x, "message"))
y <- x
attr(y, "message") <- NULL
class(y) <- class(x)[2]
# Printing the coefficient only
print(y)
# But still returning invisibly the mymodel_summary object
invisible(x)
}
# This calls summary.mymodel
ans_summary <- summary(ans)
# This calls print.mymodel_summary
ans_summary
#> mymodel summary:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 0.41886275 0.976184443 0.4290816 6.698243e-01
#> Assault 0.04002885 0.006086735 6.5764086 3.593623e-08
#> Rape 0.02514178 0.054156886 0.4642398 6.446192e-01
# Since there's not subset method for mymodel_summary, it defaults to the
# next class, which is matrix:
ans_summary[,1,drop=FALSE]
#> Estimate
#> (Intercept) 0.41886275
#> Assault 0.04002885
#> Rape 0.02514178
Created on 2023-04-27 with reprex v2.0.2
Rcpp::sourceCpp(code = "
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericMatrix sigmoid_cpp(NumericMatrix x) {
// Notice I am using Rcpp::clone. With this, I am making sure that I get
// a full copy of x. If I used this instead:
// NumericMatrix ans(x)
// I would be getting a pointer to x.
NumericMatrix ans = Rcpp::clone(x);
for (size_t i = 0u; i < x.ncol(); ++i)
ans.column(i) = 1.0/(1.0 + exp(-ans.column(i)));
return ans;
}
")
set.seed(223)
x <- matrix(rnorm(9), ncol = 3)
(sigmoid_cpp(x) - plogis(x))
#> [,1] [,2] [,3]
#> [1,] 0 0 0
#> [2,] 0 0 0
#> [3,] 0 0 0
Created on 2023-04-27 with reprex v2.0.2
Rcpp::sourceCpp(code = "
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;
// [[Rcpp::export]]
arma::mat sigmoid_cpp(const arma::mat x) {
return 1.0/(1.0 + arma::exp(-x));
}
")
set.seed(223)
x <- matrix(rnorm(9), ncol = 3)
(sigmoid_cpp(x) - plogis(x))
#> [,1] [,2] [,3]
#> [1,] 0 0 0
#> [2,] 0 0 0
#> [3,] 0 0 0
Created on 2023-04-27 with reprex v2.0.2
Rcpp::sourceCpp(code = "
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;
// [[Rcpp::export]]
List sigmoid_cpp(const arma::mat x) {
List res = List::create(
_[\"ans\"] = 1.0/(1.0 + arma::exp(-x))
);
return res;
}
")
set.seed(223)
x <- matrix(rnorm(9), ncol = 3)
(sigmoid_cpp(x)$ans - plogis(x))
#> [,1] [,2] [,3]
#> [1,] 0 0 0
#> [2,] 0 0 0
#> [3,] 0 0 0
Created on 2023-04-27 with reprex v2.0.2
inst
folder of the package# Getting the full path of the myfile.slurm file, which is
# included in the R package under:
# inst/slurm_scripts/myfile.slurm
fn <- system.file(
file.path("slurm_scripts", "myfile.slurm"),
package = "epiworldR"
)
# fn now has the FULL PATH to the myfile.slurm file.
fn
#> [1] "/home/george/R/x86_64-pc-linux-gnu-library/4.2/epiworldR/slurm_scripts/myfile.slurm"
# And thus, we can read it!
readLines(fn)
#> [1] "#!/bin/sh" "#SBATCH --job-name=sapply"
#> [3] "#SBATCH --time=00:10:00" ""
#> [5] "module load R/4.2.2" "Rscript --vanilla 01-sapply.R"
Created on 2023-04-27 with reprex v2.0.2
The problem with OpenMP is that we can't generate random numbers as easily as we would do using R. Since R is not multi-threaded, we need to use other resources. In this example, we generate n
uniformly distributed random numbers using OpenMP.
#include <Rcpp.h>
#include <random>
#ifdef _OPENMP
#include <omp.h>
#endif
using namespace Rcpp;
// [[Rcpp::plugins(openmp)]]
// [[Rcpp::export]]
std::vector< std::vector<double> > parallel_rand(
int n,
int ncores,
int seed
) {
#ifdef _OPENMP
omp_set_num_threads(ncores);
#endif
// Setting the parallel engine -----------------------------------------------
std::vector< std::mt19937 > engines;
std::uniform_real_distribution<> dunif(0.0, 1.0);
engines.push_back(std::mt19937());
engines[0].seed(seed);
// We use the #ifdef _OPENMP to check whether OMP is supported
#ifdef _OPENMP
int effective_ncores = ncores;
#else
int effective_ncores = 1;
#endif
// Setting one seed per engine (based on the original)
for (int i = 1; i < effective_ncores; ++i) {
engines.push_back(std::mt19937());
size_t seed_i =
static_cast<size_t>(
std::floor(
static_cast<double>(std::numeric_limits<size_t>::max()) *
dunif(engines[0])
)
);
engines[0+1].seed(seed_i);
}
// Figuring out how many replicates ------------------------------------------
std::vector< size_t > nreplicates(effective_ncores, 0);
std::vector< size_t > nreplicates_csum(effective_ncores, 0);
size_t sums = 0u;
for (int i = 0; i < effective_ncores; ++i)
{
nreplicates[i] = static_cast<size_t>(
std::floor(n/effective_ncores)
);
// This takes the cumsum
nreplicates_csum[i] = sums;
sums += nreplicates[i];
}
if (sums < n)
nreplicates[effective_ncores - 1] += (n - sums);
// Making room ---------------------------------------------------------------
std::vector< std::vector< double > > res(effective_ncores);
for (int i = 0; i < effective_ncores; ++i)
res[i].resize(nreplicates[i]);
// Running the experiment ----------------------------------------------------
#pragma omp parallel shared(res, engines, nreplicates) firstprivate(dunif) default(none)
{
// Getting the thread number
auto iam = omp_get_thread_num();
for (int i = 0; i < nreplicates[iam]; i++)
res[iam][i] = dunif(engines[iam]);
}
return res;
}
/***R
ans <- parallel_rand(1000, 4, 22)
res <- microbenchmark::microbenchmark(
parallel_rand(100000, 4, 22),
parallel_rand(100000, 2, 22),
parallel_rand(100000, 1, 22)
)
print(res, unit = "relative")
*/
This is the expected outcome:
> print(res, unit = "relative")
Unit: relative
expr min lq mean median uq max neval cld
parallel_rand(1e+05, 4, 22) 1.000000 1.000000 1.000000 1.000000 1.000000 1.0000000 100 a
parallel_rand(1e+05, 2, 22) 1.164803 1.092562 1.127475 1.131114 1.226660 0.6121462 100 a
parallel_rand(1e+05, 1, 22) 2.158915 1.977000 2.185225 2.145995 2.454765 1.1397473 100 b
cc @KPDuBose @UofUEpiBio/phs7045-2023
@gvegayon @chipmanj Is this the issue where we post our final R-package?
Yes
George G. Vega Yon, Ph.D. (from mobile) https://ggvy.cl Pronouns: he/him/él
On Fri, Apr 28, 2023, 13:54 SinghRavin @.***> wrote:
@gvegayon https://github.com/gvegayon @chipmanj https://github.com/chipmanj Is this the issue where we post our final R-package?
— Reply to this email directly, view it on GitHub https://github.com/UofUEpiBio/PHS7045-advanced-programming/issues/20#issuecomment-1528020541, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAG2FM7SUFWJWH37JTWJQPDXDQN73ANCNFSM6AAAAAAXBNIQMA . You are receiving this because you were mentioned.Message ID: @.***>
Hey, @UofUEpiBio/phs7045-2023, if you are having trouble with your C++ code, I've written a little bit about how to debug your Rcpp code using Valgrind and GDB: https://book-hpc.ggvy.cl/rcpp-debugging.html
Hi @gvegayon, I added a function to my package that uses the c++ function I created. However, when I try to check and build the package, I get this error;
(function (command = NULL, args = character(), error_on_status = TRUE, …
:
! System command 'Rcmd.exe' failedBacktrace:
Exited with status 1.
Could you help me identify the problem and how to resolve it?
Thanks! Ransmond
What's the link to your repo? Are you getting the same error with GitHub Actions?
George G. Vega Yon, Ph.D. +1 (626) 381 8171 https://ggvy.cl Pronouns https://www.aclunc.org/article/frequently-asked-questions-whats-pronoun: he/him/él
On Tue, May 2, 2023 at 10:13 AM rberch @.***> wrote:
Hi @gvegayon https://github.com/gvegayon, I added a function to my package that uses the c++ function I created. However, when I try to check and build the package, I get this error; ── R CMD build ─────────────────────────────────────────── ✔ checking for file 'C:\Users\u1302088\OneDrive - University of Utah\SCHOOL WORK\PHS CLASSES\SPRING 2023\R Class\VIFpkg/DESCRIPTION' ... Warning in file.copy(pkgname, Tdir, recursive = TRUE, copy.date = TRUE) : problem copying .\VIFpkg.Rproj.user\shared\notebooks\06F85F2D-README\1\s to C:\Users\u1302088\AppData\Local\Temp\RtmpiCaXg4\Rbuild31a85c4c690\VIFpkg.Rproj.user\shared\notebooks\06F85F2D-README\1\s: Permission denied ERROR copying to build directory failed Error in (function (command = NULL, args = character(), error_on_status = TRUE, …: ! System command 'Rcmd.exe' failed Exit status: 1 stdout & stderr:
Backtrace:
- devtools::check(, check_dir = dirname(getwd()))
- withr::with_envvar(pkgbuild::compiler_flags(FALSE), …
- base::force(code)
- pkgbuild::build(pkg$path, tempdir(), args = build_ar…
- withr::with_temp_libpaths(rcmd_build_tools(options$c…
- base::force(code)
- pkgbuild::rcmd_build_tools(options$cmd, c(options$path, option…
- pkgbuild::with_build_tools({ …
- withr::with_path(rtools_path(), code)
- base::force(code)
- base::withCallingHandlers(callr::rcmd_safe(..., env = env,…
- callr::rcmd_safe(..., env = env, spinner = FALSE, sh…
- callr:::run_r(options)
- base::with(options, with_envvar(env, do.call(processx::run…
- base::with.default(options, with_envvar(env, do.call(proce…
- base::eval(substitute(expr), data, enclos = parent.frame())
- base::eval(substitute(expr), data, enclos = parent.frame())
- callr:::with_envvar(env, do.call(processx::run, c(list(bin, …
- base::force(code)
- base::do.call(processx::run, c(list(bin, args = real_cmdar…
- (function (command = NULL, args = character(), error…
- base::throw(new_process_error(res, call = sys.call(), echo…
- | base::signalCondition(cond)
- (function (e) …
- asNamespace("callr")$err$throw(e) Execution halted
Exited with status 1.
Could you help me identify the problem and how to resolve it?
Thanks! Ransmond
— Reply to this email directly, view it on GitHub https://github.com/UofUEpiBio/PHS7045-advanced-programming/issues/20#issuecomment-1531748988, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAG2FM4LS2LW3HAJMWT4ZWDXEEXDFANCNFSM6AAAAAAXBNIQMA . You are receiving this because you were mentioned.Message ID: @.***>
Dear @UofUEpiBio/phs7045-2023, the instructions for the final are now posted here (source here). Like the midterm, the final will have two parts: A presentation and an R package. The deadlines are the following:
There won't be classes during the week previous to the presentation. The extension to the R packages has to be agreed during this week (April 17th).