Section 3.2 of the manual for parallel covers how to parallelize a loop and gives the following example code:
local n_proc = <number set by user>
save currdata.dta, replace
drop _all
set obs `num_total'
generate long i = _n
if `n_proc'>1 {
    parallel initialize `n_proc'
    parallel: parfor_task
}
else {
    parfor_task
}

program parfor_task
    local num_task = _N
    mkmat i, matrix(tasks_i)
    use currdata.dta, clear
    forvalues j=1/`=_N' {
        local i = tasks_i[`j',1]
        // work for i
    }
    // put output into main data
end
I believe this code is wrong. In parfor_task, the loop runs from 1 to `=_N', which is supposed to equal the number of tasks we are looping over. However, because currdata.dta has already been loaded at that point, `=_N' actually gives the number of observations in currdata.dta, which is not at all the same thing. I believe this is a typo, and the line should instead read:
forvalues j=1/`num_task' {
I suspect that's why num_task was defined: it is not used anywhere else in the example code. This cost me a few days of headaches, so hopefully this change helps someone :)
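
For reference, here is a sketch of what parfor_task might look like with the proposed fix applied. It assumes, as the manual's example does, that the task list is in memory as variable i when the program starts; everything apart from the loop bound is unchanged from the manual, and the two placeholder comments are left as they are.

```stata
program parfor_task
    // Record the number of tasks while the task list is still in memory.
    local num_task = _N
    // Copy the task indices into a matrix so they survive loading new data.
    mkmat i, matrix(tasks_i)
    // From here on, _N is the number of observations in currdata.dta,
    // not the number of tasks.
    use currdata.dta, clear
    // Proposed fix: loop over the saved task count instead of `=_N'.
    forvalues j = 1/`num_task' {
        local i = tasks_i[`j', 1]
        // work for i
    }
    // put output into main data
end
```

The point is that _N has to be read before use currdata.dta, clear runs, because that command replaces the data in memory and changes _N with it.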