Open tnaake opened 3 years ago
Hi Thomas,
thanks. I'll take a look right away.
I am a bit confused, because first you say
I had another look at the function fit_parameters_loop and improved slightly the performance of the function
but in the end
these updates [...] result in the considerable speed improvements.
Both times, you talk about the benchmark (2 seconds / 5% speed-up), right?
However, there is a problem that prevents a direct comparison of whether the two versions yield the same result, see [...] I guess this is due to some random number generation!
I am a bit concerned about this. Can you investigate it a bit more and quantify how much the results change?
I'll also make a few inline comments about some specific things.
Best, Constantin
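One way to quantify how much the results change is sketched below. It is only an illustration: `noisy_fit` is a hypothetical stand-in for a routine that draws random numbers internally, not the actual `fit_parameters_loop`, and whether random number generation really causes the mismatch is still a guess.

```r
# Hypothetical stand-in for a fitting routine that uses random numbers;
# this is NOT the package's fit_parameters_loop.
noisy_fit <- function(y) {
  vapply(y, function(v) v + rnorm(1, sd = 1e-3), numeric(1))
}

set.seed(1)
y <- runif(50)

# Resetting the seed before each call should give identical results
# if random number generation is the only source of variation.
set.seed(123); r1 <- noisy_fit(y)
set.seed(123); r2 <- noisy_fit(y)
same_seed_identical <- identical(r1, r2)

# With a different seed, measure how far the results drift apart.
set.seed(124); r3 <- noisy_fit(y)
max_abs_diff <- max(abs(r1 - r3))
max_rel_diff <- max(abs(r1 - r3) / pmax(abs(r1), .Machine$double.eps))

# all.equal() with an explicit tolerance gives a yes/no answer.
close_enough <- isTRUE(all.equal(r1, r3, tolerance = 1e-2))
```

If the seeded runs agree exactly while the unseeded ones differ only within a small tolerance, that would support the random-number explanation.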
I have found another piece of code that could be slightly optimised:
res_init <- lapply(seq_len(nrow(Y)), function(i){
    pd_lm.fit(Y_compl[i, ], model_matrix,
              dropout_curve_position = rep(NA, n_samples),
              dropout_curve_scale = rep(NA, n_samples),
              verbose = verbose)
})
could become:
rep_NA <- rep(NA, n_samples)
res_init <- lapply(seq_len(nrow(Y)), function(i){
    pd_lm.fit(Y_compl[i, ], model_matrix,
              dropout_curve_position = rep_NA,
              dropout_curve_scale = rep_NA,
              verbose = verbose)
})
Benchmarking the change with microbenchmark shows a slight decrease in run time:
microbenchmark::microbenchmark(
    res_init = {
        rep_NA = rep(NA, n_samples);
        lapply(split_Y_compl, function(i){
            pd_lm.fit(i, model_matrix,
                      dropout_curve_position = rep_NA,
                      dropout_curve_scale = rep_NA,
                      verbose = verbose)
        })},
    res_init2 = lapply(split_Y_compl, function(i){
        pd_lm.fit(i, model_matrix,
                  dropout_curve_position = rep(NA, n_samples),
                  dropout_curve_scale = rep(NA, n_samples),
                  verbose = verbose)
    })
)
Unit: seconds
      expr      min       lq     mean   median       uq      max neval cld
  res_init 2.490822 2.743049 2.980568 2.896272 3.209700 4.115103   100  a
 res_init2 2.519873 2.766047 3.080926 2.958154 3.381294 4.317780   100   b
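To put the "slight decrease" into numbers, the relative saving can be computed from the mean and median columns of the output above. This is a small sketch; the variable names are made up for illustration and the timings are copied from the table.

```r
# Timings in seconds, copied from the microbenchmark output above.
timings <- data.frame(
  expr   = c("res_init", "res_init2"),
  mean   = c(2.980568, 3.080926),
  median = c(2.896272, 2.958154)
)

hoisted <- timings[timings$expr == "res_init", ]   # rep_NA computed once
inline  <- timings[timings$expr == "res_init2", ]  # rep(NA, ...) in every call

# Relative saving of the hoisted version compared to the inline version.
rel_saving_median <- (inline$median - hoisted$median) / inline$median
rel_saving_mean   <- (inline$mean - hoisted$mean) / inline$mean

round(100 * c(median = rel_saving_median, mean = rel_saving_mean), 1)
```

By this calculation the saving is roughly 2% on the median and 3% on the mean, so the gain from this change alone is small.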
Hi @const-ae
following the same logic as pull request #8, I had another look at the function fit_parameters_loop and slightly improved its performance. It is now slightly faster than the original version (see below; I hope it is worth checking). However, there is a problem that prevents a direct comparison of whether the two versions yield the same result, see:
I guess this is due to some random number generation!
For now, see bench::mark with check=FALSE:
Short script to generate the input variables:
Surprisingly, these updates (I guess most of it comes from the lapply changes) result in considerable speed improvements.
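If the mismatch really comes from random number generation, one possible way to keep the result check in bench::mark switched on is to reset the seed inside each benchmarked expression. The sketch below assumes this; `fit_old`, `fit_new` and `seeded` are hypothetical stand-ins, not functions from the package.

```r
# Hypothetical stand-ins for the original and the optimised function;
# both consume random numbers in the same order.
fit_old <- function(y) sapply(y, function(v) v + rnorm(1, sd = 1e-3))
fit_new <- function(y) vapply(y, function(v) v + rnorm(1, sd = 1e-3), numeric(1))

y <- seq(0.1, 5, by = 0.1)

# Helper that resets the seed before every call (illustrative only).
seeded <- function(f, ...) {
  set.seed(1)
  f(...)
}

# Direct comparison of the two versions under the same seed.
results_match <- isTRUE(all.equal(seeded(fit_old, y), seeded(fit_new, y)))

# With seeded expressions, check = TRUE can stay enabled.
if (requireNamespace("bench", quietly = TRUE)) {
  timing <- bench::mark(
    old = seeded(fit_old, y),
    new = seeded(fit_new, y),
    check = TRUE
  )
  print(timing)
}
```

The cost of `set.seed` is included in both timings, so the comparison stays fair, although the absolute numbers are slightly inflated.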