drvinceknight / amwoss

Applied mathematics problems with Open Source Software: Operational Research with Python and R.
MIT License
117 stars 10 forks source link

Slow performance of LP in R #44

Closed drvinceknight closed 3 years ago

drvinceknight commented 4 years ago

@geraintpalmer has been trying numerous things:

Update: (nothing to do, just speaking aloud here)

  • Using the functions I wrote in R, I wrote the constraints matrix, objective, rhs etc to file.
  • Read these into Python, built a pulp model (call this Model 1) from these.
  • This does solves easy.
  • Define a new pulp model (Model 2) using the same formulation, but writing it from scartch in python. This solves easy.
  • I write Model 1 and Model 2 to .lp files, check the diffs, they are identical.
  • In R, using the Rglpk package, the original model hangs if I want it binary, same with lpSolve
  • In R, I use Rglpk to read in the .lp files I created with pulp (Model 1, using the matrices I wrote in R in the first place) This hangs when I try to sovle it. The only thing I can see that's different is the solver. PuLP uses coin_api not glpk. I can't get coin_api to install in R yet. But I'm still worried as this is not a big model at all. (edited)

But, Rglpk has a verbose option. This shows that is' not actually breaking. Just the branch and bound method is takign ages, and it's continually doing stuff and updating. So we are ok. This particular implementation in R is not very efficient in comparison to Python. That's all it is.

There is a "max time" option or something, and we can talk about that and so on. But I'm not sure if that's something that looks good for the book, again saying "Python can do this is miliseconds, R takes hours".

geraintpalmer commented 4 years ago
#' Writes the constraint row dealing with clashes
#'
#' @param clashes: a vector of module indicies that all cannot be scheduled at the same time
#' @param day: an integer representing the day
#'
#' @return the constraint row corresponding to that set of clashes on that day
write_Xclash_constraint_row_lhs <- function(clashes, day){
  row_lhs_Xday <- rep(0, n_modules)
  row_lhs_Xday[clashes] = 1
  row_lhs_Xnoday <- rep(0, n_modules)
  full_lhs_X <- append(append(rep(row_lhs_Xnoday, day - 1), row_lhs_Xday), rep(row_lhs_Xnoday, n_days-day))
  full_lhs <- append(full_lhs_X, rep(0, n_days))
  full_lhs
}

#' Writes the constraint row ensure every module is scheduled once
#'
#' @param module: an integer representing the module
#'
#' @return the constraint row corresponding to that set of scheduling to module on any day
write_Xreq_constraint_row_lhs <- function(module){
  row_lhs_day <- rep(0, n_modules)
  row_lhs_day[module] = 1
  full_lhs_X <- rep(row_lhs_day, n_days)
  full_lhs <- append(full_lhs_X, rep(0, n_days))
  full_lhs
}

#' Writes the constraint row representing the Y variable, whether at least one exam is scheduled on that day
#'
#' @param day: an integer representing the day
#'
#' @return the constraint row corresponding to creating Y
write_Y_constraint_row_lhs <- function(day){
  row_lhs_Xday <- rep(1, n_modules)
  row_lhs_Xnoday <- rep(0, n_modules)
  full_lhs_X <- append(append(rep(row_lhs_Xnoday, day - 1), row_lhs_Xday), rep(row_lhs_Xnoday, n_days-day))
  row_lhs_Y <- rep(0, n_days)
  row_lhs_Y[day] = -n_modules
  full_lhs <- append(full_lhs_X, row_lhs_Y)
  full_lhs
}

#' Writes all the constraints as a matrix, column of inequalities, and RHS column.
#'
#' @param list_of_clashes: a list of vectors corresponding to clashes, vectors of module indicies that all cannot be scheduled at the same time
#'
#' @return f.con the LHS of the constraints as a matrix
#' @return f.dir a vector of directions of the inequalities for each row of the constraints matrix
#' @return f.rhs a vector of LHS of the inequalities for each row of the constraints matrix
write_constraints <- function(list_of_clashes){
  all_rows <- c()
  all_dirs <- c()
  all_rhss <- c()
  n_rows <- 0

  for (clash in list_of_clashes){
    for (day in 1:n_days){
      row_lhs <- write_Xclash_constraint_row_lhs(clash, day)
      all_rows <- append(all_rows, row_lhs)

      all_dirs <- append(all_dirs, "<=")
      all_rhss <- append(all_rhss, 1)
      n_rows <- n_rows + 1
    }
  }

  for (module in 1:n_modules){
    row_lhs <- write_Xreq_constraint_row_lhs(module)
    all_rows <- append(all_rows, row_lhs)

    all_dirs <- append(all_dirs, "==")
    all_rhss <- append(all_rhss, 1)
    n_rows <- n_rows + 1
  }

  for (day in 1:n_days){
    row_lhs <- write_Y_constraint_row_lhs(day)
    all_rows <- append(all_rows, row_lhs)

    all_dirs <- append(all_dirs, "<=")
    all_rhss <- append(all_rhss, 0)
    n_rows <- n_rows + 1
  }

  f.con <- matrix(all_rows, nrow = n_rows, byrow = TRUE)
  f.dir <- all_dirs
  f.rhs <- all_rhss
  list(f.con, f.dir, f.rhs)
}

#' Writes the constraint row dealing with clashes
#'
#' @param n_modules: the number of modules to schedule
#' @param n_days: the maximum number of days to schedule (same as n_modules in worst case scenario)
#'
#' @return the objective function row to minimise
write_objective <- function(n_modules, n_days){
  obj <- c()
  num_zeros <- n_modules * n_days
  for (var in 1:num_zeros){
    obj <- append(obj, 0)
  }
  for (day in 1:n_days){
    obj <- append(obj, 1)
  }
  obj
}

# Now define the parameters, solve the problem
library(Rglpk)

n_modules = 26
n_days = 26

Ac <- c(1, 2, 3)
Ao <- c(4, 5, 6, 7, 8)
Bc <- c(9, 10, 11, 12, 13)
Bo <- c(14, 15, 16)
Cc <- c(17, 18, 19, 20)
Co <- c(21, 22)
Dc <- c(23, 24)
Do <- c(25, 26)

list_of_clashes <- list(
  c(Ac, Ao, Dc),
  c(Bc, Bo, Co),
  c(Bc, Bo, Ac),
  c(Cc, Co, Bo),
  c(Dc, Do)
)

constraints <- write_constraints(list_of_clashes = list_of_clashes)
f.con <- constraints[[1]]
f.dir <- constraints[[2]]
f.rhs <- constraints[[3]]
f.obj <- write_objective(n_modules = n_modules, n_days = n_days)

sol <- Rglpk_solve_LP(f.obj, f.con, f.dir, f.rhs, types='B', control = list("verbose"=TRUE, "presolve"=TRUE, 'tm_limit'=3*60000))

for (module in 1:n_modules){
  for (day in 1:n_days){
    var <- ((day - 1) * n_modules) + module
    if (sol$solution[var] == 1){
      print(c(module, day))
    }
  }
}
geraintpalmer commented 4 years ago

This runs, and gives an feasible solution (I abirarily set the time limit to 3 minutes, but its solved to the same objective value in 10 seconds or so)

But we do not know this is optimal.

It "is" optimal, because it gives the same objective function value are PuLP. But in isolation, we do not know this is optimal.

drvinceknight commented 4 years ago

Progress :+1:

geraintpalmer commented 3 years ago

This is resolved by using ROI.