Closed biometrician closed 1 year ago
Hi Gregor,
especially for print and plot we have to improve the running time. According to Rok, the bottleneck is summary, which prepares abe.resampling.objects. Is it possible to speed this up?
I assume it would help if these things were done once, directly in abe.resampling. Then, if print and plot are requested repeatedly, the work is already done. If it is done in abe.resampling, we could maybe do it inside the foreach loop, which would also help.
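To make the idea concrete, here is a minimal sketch, not the actual abe code: each resample returns one fixed-length coefficient row, so the matrix is assembled while the resamples are fitted and nothing has to be recomputed later. The helper fit_one_resample, the fixed lm formula and the mtcars data are all stand-ins I made up for illustration; a variable that was not selected simply stays at 0.

```r
# Hypothetical helper (not part of abe): fit one bootstrap resample and
# return a named coefficient vector padded to a fixed set of columns.
fit_one_resample <- function(data, all_terms) {
  idx <- sample.int(nrow(data), replace = TRUE)
  # Stand-in for the real selection step: a fixed model without qsec.
  fit <- lm(mpg ~ wt + hp, data = data[idx, ])
  row <- setNames(numeric(length(all_terms)), all_terms)
  row[names(coef(fit))] <- coef(fit)
  row
}

set.seed(1)
all_terms <- c("(Intercept)", "wt", "hp", "qsec")
num_resamples <- 20

# One row per resample, collected as the loop runs.
coef_mat <- t(vapply(
  seq_len(num_resamples),
  function(i) fit_one_resample(mtcars, all_terms),
  numeric(length(all_terms))
))

dim(coef_mat)
```

With foreach, the loop body could presumably return the same row and the rows could be combined with .combine = rbind, but I have not tried that against the real function.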
If saving all results in a matrix instead of the long list improves running time, which I assume it does, then for the option save.out="minimal" do we still need the list? Or is it enough to just have the information stored in the matrix? If save.out="complete", then we would have the long list plus the matrix.
print, plot and the other helper functions have to be adapted to get the data from the matrix instead of the list.
Since this is a larger change, can you do it in a separate branch? Then we can, for example, still directly compare the running time and the size of the objects.
Thanks.
I changed the output of the abe.resampling function. If save.out = "minimal", only a matrix of coefficient values is returned. If save.out = "complete", the model objects are also returned. Next week, I will rewrite the summary and print functions based on this matrix. This should be considerably faster.
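For anyone comparing object sizes, a rough sketch of the two modes could look like the following. This is an illustration under my own assumptions, not the code in the package: resample_fits, its return structure and the fixed lm formula are hypothetical, and the real function may organise its output differently.

```r
# Hypothetical stand-in for the resampling function; not the abe API.
resample_fits <- function(data, num.resamples,
                          save.out = c("minimal", "complete")) {
  save.out <- match.arg(save.out)
  terms <- c("(Intercept)", "wt", "hp")
  coefs <- matrix(NA_real_, nrow = num.resamples, ncol = length(terms),
                  dimnames = list(NULL, terms))
  # The model list is only allocated when the caller asks for it.
  models <- if (save.out == "complete") vector("list", num.resamples) else NULL
  for (i in seq_len(num.resamples)) {
    idx <- sample.int(nrow(data), replace = TRUE)
    fit <- lm(mpg ~ wt + hp, data = data[idx, ])
    coefs[i, ] <- coef(fit)
    if (save.out == "complete") models[[i]] <- fit
  }
  list(coefficients = coefs, models = models)
}

set.seed(42)
small <- resample_fits(mtcars, 50, save.out = "minimal")
full  <- resample_fits(mtcars, 50, save.out = "complete")

# Compare the memory footprint of the two results.
object.size(small)
object.size(full)
```

The coefficient matrix is filled in both modes, so helper functions can read from it regardless of save.out.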
I completely rewrote the summary and print functions. I also made some major changes to plot.abe and pie.abe. They should all be considerably faster now. Since this is a pretty big change, could you check if all of your existing code still works with these changes?
Thanks a lot, Gregor. I hope I have some time next week to check everything.
Gregor, I really appreciate the effort you are putting into this, thanks! I will run a few examples next week.
The size of abe.resampling_objects is much more reasonable with the new save.out = "minimal". This really helps a lot. However, we still have the problem that calls to plot.abe() or print.abe() may take considerable time for some abe.resampling_objects. I assume the main bottleneck is that the entire list is worked through in each call. So my suggestion would be that, in addition to the list, a matrix with the coefficients from each resample is saved in the abe.resampling_objects, i.e. a matrix with num.resamples rows and (number of variables + 1) columns, where the extra column is for the intercept. Possibly also a 0/1 matrix of included variables, if that helps the computation time. This matrix can be generated once at the end of abe.resampling(). Then future calls to plot.abe() or print.abe() can simply use this preprocessed matrix. This means working through all the helper functions, but I think the workload is not too large. I talked with Gregor; if you think this is okay, he could do it.