Closed: xhdong-umd closed this issue 6 years ago
With `overlap`, if you calculate the AKDE UDs beforehand, then that part takes all of the time. I can update the help file example as you suggest.
As far as parallelizing `akde()`, it shouldn't be parallelized by default because of RAM issues. People sometimes run out of RAM with `akde` and `occurrence` even without parallelization (and have to lower their resolutions). But it could be parallelized optionally with an argument `mc.cores=1` or something. I have been parallelizing some functions with `mclapply`, as it has very low overhead (though it requires UNIX). I should look at your parallel wrapper again and we can discuss at the next ABI meeting.
I didn't use parallel mode on `akde` in the app, because I want to calculate all the animals in one `akde` call so they will be on the same grid. I did use parallel mode on `occurrence`.
The parallel wrapper used `mclapply` on Linux/Mac and `parLapplyLB` (a socket-cluster mode with more overhead) on Windows, and also generated some default parameters so the user doesn't need to know about them.
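A minimal sketch of that kind of platform dispatch, using only the base `parallel` package (the function name is hypothetical; ctmmweb's actual wrapper differs):

```r
library(parallel)

# Hypothetical sketch of a cross-platform lapply wrapper:
# mclapply (forked workers, very low overhead) on unix,
# a load-balanced socket cluster (parLapplyLB) on Windows.
par_lapply_sketch <- function(x, fun, cores = 2L) {
  if (.Platform$OS.type == "unix") {
    mclapply(x, fun, mc.cores = cores)
  } else {
    cl <- makeCluster(cores)       # socket cluster: more startup overhead
    on.exit(stopCluster(cl))       # always clean up the workers
    parLapplyLB(cl, x, fun)        # load-balanced job scheduling
  }
}

res <- par_lapply_sketch(1:4, function(i) i^2)
```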
Looking at `par_lapply` (if I am reading it right), there needs to be an extra `mc.cores`-like argument for a scientific-computing context where jobs might run on a cluster where you don't know which nodes you are going to get (or how many cores they have), but you do know how many processing units you are allotted. The `reserved_cores` argument could then kick in if the `mc.cores` argument is left undefined/blank.
The actual cluster size/`mc.cores` parameter is calculated with some heuristics based on my experimentation. I wanted users to be able to use it as simply as possible, without worrying about the details. If the input list length is `n` and the available core count is `m`, the parallel functions will create a cluster with `cluster_size` threads and run the `n` jobs in that cluster.
`cluster_size` is chosen by these heuristics, and a user-supplied `reserved_cores` value will override them. The reason for setting `cluster_size` to the list length is that there is no benefit in running more threads than there are jobs.
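The exact heuristics are not spelled out here; the following is a hedged sketch of the general idea (cap the thread count at the job count, after honoring reserved cores), not ctmmweb's actual formula:

```r
# Hypothetical cluster-size heuristic: with n jobs and m available cores,
# use at most m minus the cores reserved for the user, and never more
# threads than there are jobs. Not ctmmweb's actual implementation.
cluster_size_sketch <- function(n, m, reserved_cores = 0L) {
  usable <- max(1L, m - reserved_cores)  # always keep at least one thread
  min(n, usable)                         # no point in more threads than jobs
}
```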
@chfleming, reading your comment again, now I understand what you mean. So sometimes the code can run on a cluster where only a subset of all cores is available to the user, while `detect_cores` only reports the total cores in the machine. We will definitely need a `cores` parameter for this.
Following up on our discussion, I think separating `plapply` into two functions, one to parallelize given a fixed core/thread argument and another to detect the hyperthread count, would be ideal for me. I would then want to merge this with my own `mclapply` and `detectCores` wrappers that I use for safe `mclapply` without Windows complaining. I will put in an argument to switch between cases where overhead is undesirable and vanilla `lapply` is used on Windows, and cases where overhead is insignificant and your socket code is used on Windows. That will cover all of my use cases internally.
I found it's really difficult to extract the core-detection part as a function, because it involves the platform, the input list size, the reserved-core value, etc. Abstracting them out of the function would mean passing all of those parameters in and out, and some if/else branches still can't be avoided. So I want to use `cores = NULL`, and call my core-detection code when the default `NULL` value is given.
This should not interfere with your `detectCores` wrappers, because you can just assign the `cores` parameter with your own function; as long as it generates a positive integer, that value will be used directly.
I'm also thinking of enabling negative values, where `-m` means to reserve `m` cores for the user, so there is no need for an additional parameter.
And I already have a `parallel = TRUE` parameter, which makes the function use `lapply` when given `parallel = FALSE`. We could also make the function use `lapply` when `cores = 1`, but I think it's better to have an explicit control parameter.
I have updated the function to implement the `cores` parameter.
cores: the core count to be used for the cluster. Could be a positive integer or a negative value (`-m` reserves `m` cores for the user).
So you can call the function with `cores = 4`, `cores = -1`, or `cores = your_own_function()`, plus `parallel = FALSE`, etc.
One special requirement of the function is that you need `align_list` for functions with more than one parameter.
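A sketch of the kind of argument alignment this implies, with a hypothetical helper (this is not ctmmweb's actual `align_list` signature):

```r
# Hypothetical sketch: align several argument lists into one list of
# argument bundles, so a multi-parameter function can be driven by a
# single-list lapply/mclapply. Not ctmmweb's actual align_list().
align_list_sketch <- function(...) {
  args <- list(...)
  n <- max(lengths(args))
  lapply(seq_len(n), function(i) lapply(args, `[[`, i))
}

# Each job now receives one bundle of aligned arguments:
jobs <- align_list_sketch(a = 1:3, b = c(10, 20, 30))
sums <- lapply(jobs, function(j) j$a + j$b)
```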
Will this serve your need?
That sounds good. I will work this into ctmm and have you critique how I mangle it.
I forgot to mention that the function uses the `crayon` package for colorful console messages. You can either replace it with regular `message()` calls, or import `crayon` if you like it.
I don't want to have messages on the command line here without a trace option passed, but I do need to go through the package and use `crayon` to differentiate the various messages and warnings.
OK, we can add a parameter to control the messages, like `msg = TRUE`.
Don't worry about it, I have completely different needs for the command line package and am restructuring for that anyhow. The webapp needs to print out basically everything and I understand that.
OK, since we are making individual copies in each package, maybe it's actually easier for us to just maintain different versions, and only sync some core parts when needed.
Alright, I incorporated the basic code here: https://github.com/ctmm-initiative/ctmm/blob/master/R/parallel.R and tested it on Windows. I will test on Linux tomorrow.
@NoonanM The `mc.cores` arguments are now all changed to just `cores` to be more general.
Looks good to me. The core code is just about the cluster and environment setup, and we have different needs regarding the core count and the default mode. I think the current setup is ideal: we can keep different versions, and just share the core code, like the cluster/environment part.
@chfleming, do you need to integrate other functions into `ctmm`? We talked about the group plot function for variograms, though I think you said users can just use the `ctmmweb` version? If no more changes are needed, I'll update the package website to reflect the recent updates.
Yeah, if you're going to be updating those functions, then I think it's best that users have the latest versions. Parallelization was the only thing that I needed internally.
OK. I'll update the website; I'm also looking at possible points that can be included in the paper.
@chfleming I tried to add a parallel option for `overlap`, since I thought it needs to calculate many combinations of animals, and those should be independent from each other.
I added a branch, `parallel_overlap`, which makes the calculation parallel over animal combinations. However, I'm not seeing a speed improvement with a bigger data set and more combinations. Only after profiling the function did I realize that the major time-consuming part is that `overlap` calculates the AKDE of the telemetry objects.
I'm wondering if these lines can be added to the example in the `overlap` help, since users may just follow the example and not realize that they should sometimes use existing home-range objects when available.
So the actual overlap calculation never takes much time, and there is no need to parallelize it, right?
Though I'm wondering if you want to integrate the generic parallel functions into ctmm. At least the often-used
FITS <- lapply(1:2, function(i) ctmm.fit(buffalo[[i]],GUESS[[i]]) )
can be parallelized. I noticed the previously exported `par_fit_tele` is actually using `ctmm.select`, not `ctmm.fit`. I renamed it to `par_try_models`, and wrote a new `par_fit_models` as the parallel version of `ctmm.fit`. Thus the line above can be written as `par_fit_models(buffalo[1:2])`. If you feel it's useful to use parallelization in more places inside ctmm, we can move the parallel functions into ctmm so they don't depend on ctmmweb, since the parallel functions only require the `parallel` package.
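To make the claim concrete, here is a runnable sketch of how such a fitting loop parallelizes with only base `parallel`; `fit_one` is a placeholder standing in for the real per-animal `ctmm.fit` call:

```r
library(parallel)

# Sketch of parallelizing a per-animal fitting loop like
#   FITS <- lapply(1:2, function(i) ctmm.fit(buffalo[[i]], GUESS[[i]]))
# fit_one is a cheap stand-in for the real model fit; forked
# mclapply on unix, plain lapply as the portable fallback.
fit_one <- function(i) i * 100   # placeholder for a real model fit
FITS <- if (.Platform$OS.type == "unix") {
  mclapply(1:2, fit_one, mc.cores = 2)
} else {
  lapply(1:2, fit_one)
}
```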