As I mentioned, MATLAB appeared to hang while starting the parpool. All the while, Netbatch showed 144 machines in use.
Pressing Ctrl-C killed the initialization of the parpool, as expected, and the errors then stated "...Failed to start parpool". However, the machines were still showing as allocated in Netbatch. Is the pool allocated or not? To find out, I tried to run a script, and errors popped up indicating that the parpool was not working properly:
Simple_Parallel_Code
Running parallel
Starting parallel pool (parpool) using the 'netbatch' profile ...
Warning: Failed to cancel the following jobs on the cluster:
Job ID: 38 Reason: nbjob: Service on host: 8 is not responding.
In cancelJobFcn (line 58)
In deleteJobFcn (line 9)
In parallel.cluster/Generic/deleteJobOrTask (line 708)
In parallel.cluster/Generic/hDestroyJob (line 485)
In parallel.internal.cluster/CJSJobMethods/destroyOneJob (line 71)
In parallel.job.CJSCommunicatingJob>@(job)CJSJobMethods.destroyOneJob(job.Parent,job,job.Support,job.SupportID) (line 100)
In parallel.job/CJSCommunicatingJob/destroyJob (line 100)
In parallel.Job>iDeleteJobs (line 1538)
In parallel.internal.cluster.hetfun (line 57)
In parallel/Job/delete (line 1335)
In parallel/Cluster/hDeleteOneJob (line 1023)
In parallel.internal.pool.AbstractInteractiveClient>iDeleteJobs (line 505)
In parallel.internal.pool/AbstractInteractiveClient/pStopLabsAndDisconnect (line 289)
In parallel.internal.pool.AbstractInteractiveClient>iCleanupIfStartupFailed (line 575)
In parallel.internal.pool.AbstractInteractiveClient>@()iCleanupIfStartupFailed(obj) (line 96)
In parallel.internal.general/DisarmableOncleanup/delete (line 25)
In parallel.internal.pool/AbstractInteractiveClient/start (line 77)
In parallel.internal.pool.AbstractClusterPool>iStartClient (line 816)
In parallel.internal.pool/AbstractClusterPool/hBuildPool (line 582)
In parallel.internal.pool.doParpool (line 22)
In parpool (line 128)
In parallel.internal.pool/PoolArrayManager/getOrAutoCreateWithCleanup (line 58)
In pctTryCreatePoolIfNecessary (line 28)
In parallel_function (line 418)
In Simple_Parallel_Code (line 15)
Hovering the mouse over the parpool icon at the bottom left of the MATLAB window shows "Failed to start the parallel pool".
However, it seems that the script may have actually run; I still need to verify this. So, is it possible for the pool to be started and functioning while MATLAB thinks that it is not?
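One way this could be checked from the client session is sketched below. It is only a rough sketch, not a confirmed diagnosis: it assumes the cluster profile is named 'netbatch' (as in the log above) and uses gcp('nocreate') so that no new pool is started by the check itself. It reports whether the client believes a pool is open, and separately lists the jobs the cluster object still considers running, which may explain machines that remain allocated in Netbatch.

```matlab
% Ask the client for the current pool without creating one.
p = gcp('nocreate');
if isempty(p)
    disp('Client reports no open parallel pool.');
else
    fprintf('Client reports a pool: Connected = %d, NumWorkers = %d\n', ...
        p.Connected, p.NumWorkers);
end

% Ask the cluster which jobs it still considers running.
% Assumption: 'netbatch' is the profile name shown in the log above.
c = parcluster('netbatch');
runningJobs = findJob(c, 'State', 'running');
fprintf('Cluster reports %d running job(s).\n', numel(runningJobs));
for k = 1:numel(runningJobs)
    fprintf('  Job ID %d, state: %s\n', runningJobs(k).ID, runningJobs(k).State);
end
```

If the first check reports no pool while the second still lists running jobs, the leftover jobs are probably the ones whose cancellation failed in the warning above.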