I am running the following code:
cv <- cv.setup(train_design, train_y,
               score = mean_se,
               num_folds = 2,
               num_iter = 1)
(train_design is a sparse matrix with around 250 columns.)
My total dataset has 9 million rows, and I am experimenting with how the chosen hyper-parameters change as I allow more data to be used.
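For reference, this is roughly how each subset is built (a minimal sketch; full_design, full_y, and n_rows are illustrative names, not my exact code):

library(Matrix)

# Illustrative only: take the first n_rows rows of the full sparse design
# matrix (a dgCMatrix with ~250 columns) and the matching response vector.
n_rows <- 800000
train_design <- full_design[seq_len(n_rows), , drop = FALSE]
train_y      <- full_y[seq_len(n_rows)]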
The code works fine with 100,000, 200,000, or 400,000 rows.
However, when I increase the number of rows to 800,000 I get the following error:
Error in readpipe(py2r) : Optunity error: broken pipe.
It is not my machine running out of memory (I checked that).
I did notice, even for smaller runs, that cv.setup was taking a minute or two to return.
Could it just be timing out because of that delay?
Is there anything I can do to avoid the broken pipe error?
Many thanks
Alan Chalk