luca-scr / GA

An R package for optimization using genetic algorithms
http://luca-scr.github.io/GA/

Miscellaneous Plotting Errors with the "GA" Library in R (Genetic Algorithm) #54

Closed swaheera closed 3 years ago

swaheera commented 3 years ago

I am working with R. I am following this tutorial (https://cran.r-project.org/web/packages/GA/vignettes/GA.html) and am learning how to optimize functions using the "genetic algorithm".

The entire process is illustrated in the code below:

Part 1: Generate some sample data ("train_data")

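Part 2: Define the "fitness function". It takes 7 numbers (random_1, random_2, random_3, random_4, split_1, split_2, split_3), uses them to bin the data and compute a quantile for each bin, and returns a single value (the "total", computed in the code as the mean of "diff").

Part 3: Use the "genetic algorithm" to find the set of these 7 numbers that produces the largest value of the "total".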

Below, I illustrate this entire process :

Part 1

#load libraries
library(dplyr)
library(GA)

# create some data for this example
a1 = rnorm(1000,100,10)
b1 = rnorm(1000,100,5)
c1 = sample.int(1000, 1000, replace = TRUE)
train_data = data.frame(a1,b1,c1)

Part 2

#define fitness function
fitness <- function(random_1, random_2, random_3, random_4, split_1, split_2, split_3) {

    #bin data according to random criteria
    train_data <- train_data %>% mutate(cat = ifelse(a1 <= random_1 & b1 <= random_3, "a", ifelse(a1 <= random_2 & b1 <= random_4, "b", "c")))

    train_data$cat = as.factor(train_data$cat)

    #new splits
    a_table = train_data %>%
        filter(cat == "a") %>%
        select(a1, b1, c1, cat)

    b_table = train_data %>%
        filter(cat == "b") %>%
        select(a1, b1, c1, cat)

    c_table = train_data %>%
        filter(cat == "c") %>%
        select(a1, b1, c1, cat)

    #calculate  quantile ("quant") for each bin

    table_a = data.frame(a_table%>% group_by(cat) %>%
                             mutate(quant = quantile(c1, prob = split_1)))

    table_b = data.frame(b_table%>% group_by(cat) %>%
                             mutate(quant = quantile(c1, prob = split_2)))

    table_c = data.frame(c_table%>% group_by(cat) %>%
                             mutate(quant = quantile(c1, prob = split_3)))

    #create a new variable ("diff") that measures if the quantile is bigger than the value of "c1"
    table_a$diff = ifelse(table_a$quant > table_a$c1,1,0)
    table_b$diff = ifelse(table_b$quant > table_b$c1,1,0)
    table_c$diff = ifelse(table_c$quant > table_c$c1,1,0)

    #group all tables

    final_table = rbind(table_a, table_b, table_c)
    # calculate the total mean: this is the value to be optimized (returned explicitly)
    mean(final_table$diff)

}

Part 3


#run the genetic algorithm (at most 20 iterations to keep it short):
GA <- ga(type = "real-valued", 
         fitness = function(x)  fitness(x[1], x[2], x[3], x[4], x[5], x[6], x[7]),
         lower = c(80, 80, 80, 80, 0,0,0), upper = c(120, 120, 120, 120, 1,1,1), 
         popSize = 50, maxiter = 20, run = 20)

All of the above code (Part 1, Part 2, Part 3) works fine.

Problem: Now, I am trying to produce some of the visual plots from the tutorial:

First Plot - This Works:

plot(GA)

But I can't seem to produce the other plots from the tutorial:

Second Plot: Does Not Work

lbound <- 80
ubound <- 120

curve(fitness, from = lbound, to = ubound, n = 1000)
points(GA@solution, GA@fitnessValue, col = 2, pch = 19)

 Error: Problem with `mutate()` column `cat`.
i `cat = ifelse(...)`.
x argument "random_3" is missing, with no default
Run `rlang::last_error()` to see where the error occurred. 

Error in xy.coords(x, y) : 'x' and 'y' lengths differ

Third Plot : Does Not Work

random_1 <- random_2 <- seq(80, 120, by = 0.1)
f <- outer(x1, x2, fitness)
persp3D(x1, x2, fitness, theta = 50, phi = 20, col.palette = bl2gr.colors)

Error: Problem with `mutate()` column `cat`.
i `cat = ifelse(...)`.
x argument "random_3" is missing, with no default
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning message:
 Error: Problem with `mutate()` column `cat`.
i `cat = ifelse(...)`.
x argument "random_3" is missing, with no default
Run `rlang::last_error()` to see where the error occurred. 

Error in z[-1, -1] : object of type 'closure' is not subsettable

Fourth Plot: Does Not Work

filled.contour(random_1, random_2, fitness, color.palette = bl2gr.colors)

Error in min(x, na.rm = na.rm) : invalid 'type' (list) of argument

Can someone please show me how to fix these errors?

Thanks

luca-scr commented 3 years ago

The issue is that you are trying to adapt R code written for one-dimensional and two-dimensional problems to your case, which is a 7-dimensional problem. You simply can't do that, and when you try, R gives errors. For instance, when you use

curve(fitness, from = lbound, to = ubound, n = 1000)

you are asking R to plot fitness as a one-dimensional function from lbound to ubound on a regular grid of n points. But fitness has 7 input arguments, so there is no single curve to draw. How can you do that?
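One possible workaround, offered only as a sketch and not as something the GA package provides, is to plot low-dimensional slices of the function: vary one or two inputs on a grid and hold the remaining ones fixed at a chosen point. In the code below, `toy_fitness`, `best`, `slice_1d` and `slice_2d` are hypothetical names introduced for illustration. `toy_fitness` is a simple base-R stand-in with 7 arguments, not the fitness function from the question, and `best` is an assumed point playing the role of a GA solution.

```r
# Hypothetical stand-in for a 7-argument fitness function
# (NOT the function from the question; used only to keep the sketch self-contained)
toy_fitness <- function(r1, r2, r3, r4, s1, s2, s3) {
  -((r1 - 100)^2 + (r2 - 95)^2 + (r3 - 105)^2 + (r4 - 90)^2) / 1000 + s1 * s2 * s3
}

# Assumed point at which the other inputs are held fixed
best <- c(100, 95, 105, 90, 1, 1, 1)

# One-dimensional slice: vary input `index` over `grid`, keep the other six fixed
slice_1d <- function(f, point, index, grid) {
  sapply(grid, function(v) {
    p <- point
    p[index] <- v
    do.call(f, as.list(p))   # call f with the 7 values as separate arguments
  })
}

# Two-dimensional slice: vary inputs `i` and `j`, keep the other five fixed
slice_2d <- function(f, point, i, j, grid_i, grid_j) {
  outer(grid_i, grid_j, Vectorize(function(u, v) {
    p <- point
    p[c(i, j)] <- c(u, v)
    do.call(f, as.list(p))
  }))
}

# Curve of the fitness along the first input only
xs <- seq(80, 120, length.out = 200)
y <- slice_1d(toy_fitness, best, index = 1, grid = xs)
plot(xs, y, type = "l", xlab = "input 1", ylab = "fitness (other inputs fixed)")
points(best[1], do.call(toy_fitness, as.list(best)), col = 2, pch = 19)

# Contour plot of the fitness over the first two inputs
g <- seq(80, 120, length.out = 50)
z <- slice_2d(toy_fitness, best, 1, 2, g, g)
filled.contour(g, g, z,
               plot.title = title(xlab = "input 1", ylab = "input 2"))
```

With a fitted model, the fixed point could be taken from the GA result, for example `unname(GA@solution[1, ])`. Dropping the names matters because `do.call()` would otherwise try to match the column names of the solution against the argument names of the fitness function. Each slice shows the function only along the chosen inputs, so it says nothing about how the fitness behaves when the fixed inputs change.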