Closed: bquistorff closed this 8 years ago
That makes sense, although it should come with a warning. I do see some reasons why using parallel with fewer observations than clusters could be useful. The change should be made directly in the parallel function, on the following lines:
https://github.com/gvegayon/parallel/blob/master/ado/parallel.ado#L284
https://github.com/gvegayon/parallel/blob/master/ado/parallel.ado#L313
https://github.com/gvegayon/parallel/blob/master/ado/parallel.ado#L340
https://github.com/gvegayon/parallel/blob/master/ado/parallel.ado#L358
Further, if it is being used with by(), it might be more complicated. What if N > $PLL_CLUSTERS, but after using by(var) it turns out that N[by(var)] < $PLL_CLUSTERS? That seems complicated.
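One way the by() case might be handled is to count the by-groups first and cap the cluster count at that number. The sketch below is only an illustration. It assumes that N[by(var)] means the number of distinct by-groups, that $PLL_CLUSTERS already holds the requested number of clusters, and that groupvar is a hypothetical by() variable. None of it is taken from parallel.ado.

```stata
* Hypothetical sketch, not code from parallel.ado.
* Assumes $PLL_CLUSTERS is set and groupvar is the by() variable.
tempvar first
egen `first' = tag(groupvar)      // 1 for one observation per group, 0 otherwise
quietly count if `first'
local ngroups = r(N)              // number of distinct by-groups

* Use no more clusters than there are groups to hand out
local nclusters = min($PLL_CLUSTERS, `ngroups')
if (`nclusters' < $PLL_CLUSTERS) {
    display as text "Warning: only `ngroups' by-groups; using `nclusters' clusters."
}
```

The local nclusters would then stand in for $PLL_CLUSTERS when the data are split, leaving the global itself untouched.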
Done in 86fc862.
In the default setup (where the dataset is divided amongst the clusters) there is an error when there are fewer observations than clusters. Ideally, we'd temporarily lower the number of clusters used. Other ways of using parallel might suffer from similar problems; I haven't checked.
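A minimal sketch of what "temporarily lower the number of clusters" could look like in Stata follows. It assumes $PLL_CLUSTERS holds the requested number of clusters and uses a hypothetical local, nclusters, for the effective count. It is not the change made in the actual fix.

```stata
* Hypothetical sketch, not the code from the actual fix.
* Assumes $PLL_CLUSTERS holds the requested number of clusters.
local nclusters = $PLL_CLUSTERS
if (_N < `nclusters') {
    display as text "Warning: " _N " observations for $PLL_CLUSTERS clusters; using " _N " clusters."
    local nclusters = _N          // one observation per cluster at most
}
* `nclusters' would replace $PLL_CLUSTERS when dividing the dataset;
* the global is left unchanged, so later calls see the original setting.
```

Keeping the reduced count in a local rather than overwriting the global is what makes the change temporary. An empty dataset (_N equal to 0) would still need separate handling.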