Open andrewrech opened 2 years ago
Hi Andrew, Thanks a lot for sharing. Looks like a good and concise improvement to me (on a first quick look).
I'm also tagging @jiajic and @mattobny here so that they can also take a look, but feel free to do a PR. We like to give people credit for their work.
We've also run into some `future` issues, but at the same time we like the premise that, in theory, it can be run on any machine or configuration. Perhaps we should extend `lapply_flex` and have mclapply as one of the options. This is also the one that I have always used, but it's not compatible with Windows afaik.
Ruben
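A rough sketch of what an mclapply option in a flexible lapply wrapper could look like. The function name `lapply_backend` and its `backend` and `cores` arguments are hypothetical and are not the existing `lapply_flex` signature; the only real calls used are `lapply` and `parallel::mclapply`.

```r
library(parallel)

# Hypothetical helper (NOT the package's actual lapply_flex): picks a serial
# or forked backend based on the `backend` argument.
lapply_backend <- function(X, FUN, ..., backend = c("serial", "mclapply"),
                           cores = 2L) {
  backend <- match.arg(backend)
  # Forking is not available on Windows, so fall back to plain lapply there.
  if (backend == "mclapply" && .Platform$OS.type != "windows") {
    parallel::mclapply(X, FUN, ..., mc.cores = cores)
  } else {
    lapply(X, FUN, ...)
  }
}

# Same call shape for both backends; only `backend` changes.
squares <- lapply_backend(1:4, function(i) i^2, backend = "mclapply")
```

Falling back to serial on Windows keeps the "runs on any machine" premise, at the cost of no speed-up there.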
Agree on all counts re: `future`. Yes, no `mclapply` on Windows, unfortunately. It is unclear to me how many people will hit this need for efficient permutation testing. You could always supply a multi-arch Docker container and stop caring about platforms :-)
From a user perspective, an mclapply backend in `lapply_flex` sounds good (that is what I did on my fork once I realized `lapply_flex` existed).
@jiajic @mattobny the trick above is to build the (sub) data.tables on the fork and then use `data.table::rbindlist` to avoid extra allocations back on master. `data.table` handles very long tables with aplomb so this works alright, whereas outside `data.table`, multi-GB vectors were slow to work with.
That is all I changed, no need for credit (but thanks), please feel free to copy from here if helpful.
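For readers skimming the thread, the pattern described above can be sketched as follows. This is a minimal illustration, not the code from the original post: `simulate_one`, its columns, and the core count are made-up placeholders, and only `parallel::mclapply` and `data.table::rbindlist` are real calls.

```r
library(data.table)
library(parallel)

# Hypothetical per-simulation worker: the (sub) data.table is built inside
# the forked child, so the parent only receives finished tables.
simulate_one <- function(sim_id, n_edges = 5L) {
  data.table(
    sim  = sim_id,
    from = sample.int(10L, n_edges, replace = TRUE),
    to   = sample.int(10L, n_edges, replace = TRUE)
  )
}

# mclapply only forks on Unix-alikes; with mc.cores = 1 it runs serially.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
parts <- mclapply(1:8, simulate_one, mc.cores = n_cores)

# A single rbindlist call in the parent stacks all pieces into one long table.
sims <- rbindlist(parts)
```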
I've tested the current implementation against the proposed parallelized version, and the time gains seem to depend on your hardware and dataset. I believe the best way forward would be to combine both functions so that the number of simulations can be divided optimally per core.
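One possible way to divide simulations per core, sketched under assumptions: the simulation count, the core count, and the `run_chunk` worker are placeholders, and `parallel::splitIndices` is just one convenient way to get near-equal chunks.

```r
library(parallel)

n_sims  <- 1000L
n_cores <- 4L  # assumed value; could come from parallel::detectCores()

# Split simulation ids into one near-equal chunk per core, so each worker
# gets a single large task instead of many tiny ones.
chunks <- splitIndices(n_sims, n_cores)

# Placeholder worker: runs every simulation in its chunk serially.
run_chunk <- function(ids) vapply(ids, function(i) i %% 2L, integer(1))

# mclapply cannot fork on Windows, so use a single core there.
use_cores <- if (.Platform$OS.type == "windows") 1L else n_cores
results <- mclapply(chunks, run_chunk, mc.cores = use_cores)
```

Whether one chunk per core is actually optimal would depend on the hardware and dataset, as noted above.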
Hi there,
I needed a fast implementation of `make_simulated_network` for my work. The below is a pure-`data.table` approach that is 3 or 4 logs (orders of magnitude) faster on my 120-core system, with `O(n)` memory usage.
Your implementation of `future` does not work for me so I've used `mclapply` here, but you could edit as needed without major differences in performance.
Happy to craft a PR if there is interest.
Thanks
Andrew