luca-scr / GA

An R package for optimization using genetic algorithms
http://luca-scr.github.io/GA/
91 stars 29 forks source link

[Help wanted] Use GA with integer #7

Closed LaCreArthur closed 7 years ago

LaCreArthur commented 7 years ago

Hello, Thanks for your package, I'm managed to use it efficiently with the type "real-valued" but I cannot find how to use it with the behavior of the "real-valued" type but with a population of integer (I have a one-variable problem that does not require floating precision). When I try the type "permutation" I have this error :

Error in sample.int(length(x), size, replace, prob) : 
  cannot take a sample larger than the population when 'replace = FALSE'

And trying to debug it, I found that, in type "permutation", it generate a matrix of popSize * (max - min) and that is why the error is triggered; but I just need a 1D integer population of size popSize.

I search for a way to generate an integer population instead of rounding the variable inside the fitness function, in order to have the best performance. My ultimate goal is to have a step variable in my function in order to generate fewer individuals.

For instance, my function search for intervals by finding a cut off, and it generate a bush of solutions greater than 53 like : 53.04, 53.69, 53.14, ... because it found the correct interval that is [54, ... ], So instead of computing all this redundant solutions, I wanted it to generate only integer solutions so it will be quicker and more accurate.

I hope I was clear and thanks in advance for your help.

luca-scr commented 7 years ago

If you have integer you may use "binary" encoding of integer values. See functions binary2decimal and decimal2binary

LaCreArthur commented 7 years ago

Ok, I'll try thank you. But depending of the input, my function may want to change the precision of the individuals, it can be integer or one-decimal ect... The goal is to optimize the computation time by not computing 53.042 and 53.041 if the probleme precision is only a one-decimal matter.

LaCreArthur commented 7 years ago

I've managed a working result with type binary and by multiplying number by 10^(decimal of the number). The computation time is almost x2 faster, thank you for the idea. I've still a confusing result : with parallel = T the computation is slower (despite my i7-4770) ... Is something wrong with that ?

luca-scr commented 7 years ago

Parallelisation is not always beneficial. If the evaluation of your fitness function is not expensive then the overhead in setting up parallelisation, communication between the master and the slaves, can be greater than a sequential evaluation.

From https://en.wikipedia.org/wiki/Parallel_computing

Not all parallelization results in speed-up. Generally, as a task is split up into more and more threads, those threads spend an ever-increasing portion of their time communicating with each other. Eventually, the overhead from communication dominates the time spent solving the problem, and further parallelization (that is, splitting the workload over even more threads) increases rather than decreases the amount of time required to finish. This is known as parallel slowdown.[19]

LaCreArthur commented 7 years ago

Yeap, that's it, thank you. This must be my small test data set that causes overhead. I will try it on some bigger ones..