Closed amisepa closed 1 year ago
FYI:
c =
Local Cluster
Properties:
Profile: Processes
Modified: false
Host: laptop-i01si850
NumWorkers: 8
NumThreads: 1
RequiresOnlineLicensing: false
Associated Jobs:
Number Pending: 0
Number Queued: 0
Number Running: 0
Number Finished: 0
commenting limo_check_ppool fixes the error.
You can comment the call to limo_check_ppool in limo_random_robust and it will call your parpool with default settings, but I'm sure Cyril has a better answer. :)
On Wed, 2023-05-17 at 20:24 -0700, Cedric Cannard wrote:
Latest Matlab, EEGLAB, and both master and HotFixes branch give the following error. 1st level completed successfully. Trying to do paired t-test at 2nd level, I get this error when computing bootstrap: Error using parpool Too many workers requested. The cluster "Processes" has the NumWorkers property set to a maximum of 8 workers but 15 workers were requested. Either request a number of workers less than NumWorkers, or increase the value of the NumWorkers property for the cluster (up to a maximum of 512 for the Local cluster).
Error in limo_check_ppool (line 36) parpool(N-1);
Error in limo_random_robust (line 515) limo_check_ppool
Error in limo_random_select (line 817) tmpname = limo_random_robust(3,tmp_data1,tmp_data2,parameters,LIMO);
Error in limo_random_effect>Paired_t_test_Callback (line 219) limo_random_select('paired t- test',handles.chan_file,'nboot',handles.b,'tfce',handles.tfce,'type', 'Channels');
Error in gui_mainfcn (line 95) feval(varargin{:});
Error in limo_random_effect (line 29) gui_mainfcn(gui_State, varargin{:});
Error in matlab.graphics.internal.figfile.FigFile/read>@(hObject,eventdata)lim o_random_effect('Paired_t_test_Callback',hObject,eventdata,guidata(hO bject)) Error while evaluating UIControl Callback. Turning off parpool in MAtlab preferences or in EEGLAB preferences does not solve the issue. Thanks in advance — Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you are subscribed to this thread.Message ID: @.***>
Thanks, yes that's what I'm doing currently but wanted to report as I'm guessing this is for providing faster computation and should take into account the right number of workers and threads automatically.
something is wrong in assessing your currently running processes (1) I pushed a fix which is to closes the ppool after 1st level see this commit (2) but I'd like to have the true solution (unfortunately I cannot reproduce your error, it doesn't bug on my windows laptop or linux server)
can you try the following in the command window
limo_check_ppool % this starts the ppool the limo way
p = gcp('nocreate');
now p
should not be empty? if not empty I don't understand why it enters the if statement in limo_check_ppool.
If p
is empty, it enters the if statement and launches the pool which leads to your error since it is already in use
what do you have for N
? (your error message says 15)
N = getenv('NUMBER_OF_PROCESSORS'); % logical number of cores (i.e. count hyperthreading)
if isempty(N)
N = feature('numcores'); % physical number of cores
elseif ischar(N)
N = str2double(N);
end
I don't understand how calling the function in limo_batch doesn't fail but somehow fails calling it a second time .. N
in the limo_check_ppool should be the same, 15 workers because N should be 16 (ie you have 8 processors with 2 threads)
I think I had commented it at 1st level and forgot about it. NowI get the error with limo_batch too, so that makes more sense.
Found a solution.
To take into account the total number of processors (16 in my case), you have to use your variable 'c' like this at line 36:
p = c.parpool(N-1);
If I understand correctly, this increases the cluster profile's number of workers to 16, otherwise, it is limited to the physical cores without taking into account the threads (8 in my case).
Why using N-1 and not N by the way?
yes thx I know how this works, but you haven't answered my questions above about p
or N
, so writing a code that works for everybody while fixing your particular problem is not that easy ...
I'm assuming (1) p
is empty and therefore it enters the if statement (2) N
is correct (16 from getenv) and (3) it is calling parpool(N-1)
that fails and your proposed solution is instead to do p = c.parpool(N-1)
although i really don't see why setting up p at this point just doing c.parpool(N-1) starts the pool (changed here)
if all the above is true, it still doesn't solve the initial error you reported The cluster "Processes" has the NumWorkers property set to a maximum of 8 workers but 15 workers were requested
that N-1 is too high.
following up -- @amisepa ? have you figured how you got the error The cluster "Processes" has the NumWorkers property set to a maximum of 8 workers but 15 workers were requested
if indeed you have 16 workers
Yes, p was empty initially because limo_check_ppool was simply crashing because of the error.
I believe the problem was that you were opening with parpool(N-1)
, which sets by default the maximum number of workers in the cluster profile to the number of physical cores, excluding threads to avoid potential hyperthreading issues maybe?). So the parpool had a limit of 8 workers max by default. Now, the cluste profile is updated to 16 by using the variable c
, which takes into account the total number of processors (cores + threads).
Before it could only work for N = feature('numcores');
but not for N = getenv('NUMBER_OF_PROCESSORS');
Setting up p
with p = c.parpool(N-1);
is not necessary indeed. It's c
that's the solution. I think this solution should account for most people.
You probably no longer need to shut down parpool at the end of LIMO 1st level now, since it takes longer to open twice, and should work fine now.
doesn't make sense because if p is empty it goes to line https://github.com/LIMO-EEG-Toolbox/limo_tools/blob/HotFixes/limo_check_ppool.m#L36 and doesn't crash at it uses try - catch ; so was not empty to enter the loop causing the crash -- I guess c.parppol solves it
You didn;t have the try catch at that point when I submitted the error, it was running like this: https://github.com/LIMO-EEG-Toolbox/limo_tools/blob/master/limo_check_ppool.m Again, it was crashing line 36 because p was empty and N was above max limit.
By the way, your new catch has an error for me:
Latest Matlab, EEGLAB, and both master and HotFixes branch give the following error. 1st level completed successfully. Trying to do paired t-test at 2nd level, I get this error when computing bootstrap:
Turning off parpool in MAtlab preferences or in EEGLAB preferences does not solve the issue.
Thanks in advance