Closed rbecerril closed 6 years ago
Hi,
It is difficult to say what's happening. Your case is too complicated to follow and I don't have time to work through it. I may eventually look at a minimal reproducible example (data + code, see https://en.wikipedia.org/wiki/Minimal_Working_Example).
First I would look at the following:
- "recently I started experiencing problems with it." Do you mean that previously the exact same code + data was working? Can you prove that?
- Then you wrote "I changed my data and […]". What if you change the data again? And again?
- Try calling the fitness function with random solutions and see if it returns what you expect.
- Run the code without parallelization to see if the problem is in the parallelization.
The main idea of the steps above is to isolate each part, so that you can rule it out, one part at a time, as the source of the problem.
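A minimal sketch of the third check, calling the fitness function directly (serially, outside ga) on edge-case and random binary strings. The names toy_fitness and check_fitness are hypothetical and not part of the GA package; toy_fitness is only a stand-in for the real fitness function and is written to fail on an all-zero string.

```r
# Hypothetical toy fitness function standing in for the real one.
toy_fitness <- function(assortment) {
  chosen <- which(assortment == 1)
  slots <- numeric(length(assortment))
  slots[1:length(chosen)] <- chosen  # errors when 'chosen' is empty
  sum(slots)
}

# Hypothetical helper: evaluate 'fitness' on the all-zero string, the all-one
# string and 'n_trials' random binary strings, and collect every input that
# raises an error or returns something other than one finite number.
check_fitness <- function(fitness, nBits, n_trials = 100, seed = 1234) {
  set.seed(seed)
  candidates <- c(
    list(rep(0, nBits), rep(1, nBits)),
    # vary the share of ones so that sparse strings are tried as well
    lapply(seq_len(n_trials),
           function(i) rbinom(nBits, size = 1, prob = runif(1)))
  )
  failures <- list()
  for (x in candidates) {
    res <- tryCatch(fitness(x), error = function(e) e)
    bad <- inherits(res, "error") || !is.numeric(res) ||
      length(res) != 1 || !is.finite(res)
    if (bad) {
      failures[[length(failures) + 1]] <- list(solution = x, result = res)
    }
  }
  failures
}

failures <- check_fitness(toy_fitness, nBits = 5)
length(failures)                         # number of failing inputs
conditionMessage(failures[[1]]$result)   # error text for the first failure
```

Each entry of failures keeps the offending solution, so it can be replayed on its own under debug() or browser(). For the fourth check, ga() takes a parallel argument, and setting parallel = FALSE runs the same search serially.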
All the best,
Luca
On 5 May 2018, at 20:54, rbecerril notifications@github.com wrote:
Hi Luca,
First of all, thank you for sharing and maintaining this very useful package. Second, recently I started experiencing problems with it. It used to work fine but then I changed my data and now I am getting this message:
Error in { : task 31 failed - "replacement has length zero"
Calls: ga -> %DO% ->
Execution halted
I am using GA version 3.0.2, on R version 3.4.1 (2017-06-30) -- "Single Candle", running it on 15 cores of a computing cluster with platform: x86_64-pc-linux-gnu (64-bit)
The call in my program is
pdf(file = sprintf("gamonitor%s.pdf", filesuff))
assortmentOpt = ga(type="binary", fitness=expected_profit,
                   assortmentmap=assortmentmap, N0l=N, M0l=M,
                   Nl=Nfocal, Ml=Mfocal, Kl=K, Dl=D, maxnKl=maxnK,
                   cindl=cindfocal, tauVl=tauVfocal, thetaVl=thetaVfocal,
                   Xl=Xfocal, marginsVl=focaldata$margins,
                   suggestions=initialConds,
                   nBits = nrow(assortmentbase), names=assortmentbase$key,
                   pmutation = 0.25, pcrossover = 0.75,
                   seed = 1234, maxiter = maxOptiter,
                   parallel=runparallel, monitor=plot)
dev.off()
the fitness function is
# compute profit for the entire chain
expected_profit = function(assortment, assortmentmap, N0l, M0l, Nl, Ml, Kl, Dl, maxnKl, cindl, tauVl, thetaVl, Xl, marginsVl ) {
  # compute profit specific to a draw in the chain of estimates
  draw_profit = function(Nl, Ml, Kl, nKl, cindl, aindl, taul, thetal, Xl, marginsl ) {
    expu = matrix(NA, nrow = Nl, ncol = Kl)
    expuMargin = matrix(NA, nrow = Nl, ncol = Kl)
    # compute utilities for available alternatives
    for (n in 1:Nl){
      for (k in 1:nKl[n]){
        expu[n,k] <- exp( taul[cindl[n],aindl[n,k]] + t(Xl[n,aindl[n,k],]) %*% thetal[cindl[n],] + logiterrors[n,k])
        expuMargin[n,k] = expu[n,k] * marginsl[n,aindl[n,k]]
        # if price coeff is negative, omit observation
        if (thetal[cindl[n],1]<0) expuMargin[n,k]=0
      }
    }
    expectedProfit = sum(apply(expuMargin,1,sum, na.rm=TRUE) / apply(expu,1,sum, na.rm=TRUE))
    return(expectedProfit)
  }
  ######################################
  # RECONSTRUCT DATA STRUCTURES
  ######################################
  # expand gene to span all transactions, rows are observations, columns are alternatives
  tmp = matrix(assortment[assortmentmap], ncol=Kl, nrow=Nl, byrow=TRUE)
  maxnKl = max(apply(tmp,1,function(x) sum(!is.na(x))))
  aindl = matrix(0, ncol=maxnKl, nrow=Nl)
  nKl = numeric(Nl)
  if (sum(tmp==0)>0) {
    # construct list of available alternatives for each observation
    for (n in 1:Nl) {
      tmp2 = which(tmp[n,]==1)
      aindl[n, 1:length(tmp2)] = tmp2
      nKl[n] = length(tmp2)
    }
  } else {
    aindl = matrix(rep(1:Kl,Nl), ncol = Kl, nrow=Nl, byrow=TRUE)
    nKl = rep(K,Nl)
  }
  marginsl = array(NA,c(Nl,Kl))
  for (n in 1:Nl){
    indices = ((n-1)*Kl+1):(n*Kl)
    marginsl[n,] = marginsVl[indices]
  }
  ######################################
  # COMPUTE PROFITS
  ######################################
  profit = numeric(usedraws)
  for (draw in 1:usedraws){
    thetal = matrix(as.numeric(thetaVl[draw,]), nrow = Ml, ncol=Dl, byrow=FALSE)
    taul= matrix(as.numeric(tauVl[draw,]), nrow=Ml, ncol=Kl, byrow=FALSE)
    profit[draw] = draw_profit(Nl, Ml, Kl, nKl, cindl, aindl, taul, thetal, Xl, marginsl )
  }
  tmp = sum(profit, na.rm=TRUE)
  if (length(tmp)==0 | is.na(tmp) | is.nan(tmp) | is.infinite(tmp)) return(0) else return(tmp)
}
So I am avoiding invalid values for the fitness function.
If you could offer any insight on what may be happening, I would greatly appreciate it.
Thanks in advance
Rafael
Luca,
Thanks for the prompt reply and for the recommendations. I was hoping you may have seen something similar before, but the suggestions are very useful anyway. Just to follow up on your questions:
- First I would look at the following:
- "recently I started experiencing problems with it." Do you mean that previously the same exact code + data was working? Can you prove that?
Great suggestion. Will test with the old dataset again.
- then you wrote "I changed my data and […]". What if you change again the data? And again?
I am using two different datasets, and the algorithm breaks down for both. Maybe these datasets are particularly different from previous datasets.
- try to call the fitness function with random solutions and see if the function returns what you expect.
Will try this after the other tests.
- run the code not in parallel to see if the problem is in the parallelization.
Good idea. I am already working on that.
Thanks again. I'll post back once I figure out what the issue is.
Rafael
Luca,
I debugged the code serially on a Windows machine, which gave me more detailed debugging information. For some reason the error output on the Linux server was not very informative. It turned out to be a problem with the fitness function; there is nothing wrong with GA.
Thanks again for the recommendations.
Rafael
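When a fitness function fails inside a parallel worker, the message that comes back can be as terse as the "task 31 failed" line quoted earlier in this thread. One way to get more detail without leaving the cluster is to wrap the fitness function so that it records the input that triggered the error. The sketch below is only an illustration: with_diagnostics and fragile_fitness are hypothetical names, not part of GA, and the thread does not say which line of the real fitness function was at fault.

```r
# Hypothetical wrapper: returns a function that behaves like 'fitness' but,
# on error, saves the failing solution to 'dump_file' and re-raises the
# error with the solution included in the message.
with_diagnostics <- function(fitness, dump_file = tempfile(fileext = ".rds")) {
  function(x, ...) {
    tryCatch(
      fitness(x, ...),
      error = function(e) {
        # keep the failing input so it can be replayed serially later
        saveRDS(list(solution = x, message = conditionMessage(e)), dump_file)
        stop(sprintf("fitness failed for solution [%s]: %s (saved to %s)",
                     paste(x, collapse = ""), conditionMessage(e), dump_file),
             call. = FALSE)
      }
    )
  }
}

# Hypothetical fitness function that fails when no element exceeds 1.
fragile_fitness <- function(x) {
  y <- numeric(1)
  y[1] <- x[x > 1]  # zero-length right-hand side when nothing exceeds 1
  y
}

safe_fitness <- with_diagnostics(fragile_fitness)
safe_fitness(c(0, 2))    # evaluates normally
# safe_fitness(c(0, 1))  # would stop with the solution and the saved file path
```

The wrapped function can be passed to ga() through its fitness argument in place of the original one. The saved file can then be loaded with readRDS() and the stored solution fed to the unwrapped function in an interactive session.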
Hi Luca,
First of all, thank you for sharing and maintaining this very useful package. Second, recently I started experiencing problems with it. It used to work fine but then I changed my data. Now, after a number of iterations, (see this file ga_monitor__M130_m18q1_2.pdf ), I get this message:
I am using GA version 3.0.2, on R version 3.4.1 (2017-06-30) -- "Single Candle" running it on 15 cores of a computing cluster with platform: x86_64-pc-linux-gnu (64-bit)
The call in my program is
the fitness function is
So I am avoiding invalid values for the fitness function.
If you could offer any insight on what may be happening, I would greatly appreciate it.
Thanks in advance
Rafael