dleutnant / swmmr

R Interface for US EPA's SWMM
https://cran.r-project.org/package=swmmr
39 stars 15 forks source link

How can i consider initial parameter value in SWMMR DEoptim packages #75

Closed boram1024 closed 2 years ago

boram1024 commented 2 years ago

Hello, I'm a beginner in R and Autocalibration. As shown in "https://cran.r-project.org/web/packages/swmmr/vignettes/How_to_autocalibrate_a_SWMM_model_with_swmmr.html" I just follow up the code. In here, I just wondering how can I consider initial parameter value in SWMMR and DEoptm code. For example, SCE-UA(in hydromad package) is consider initial parameter value. Is it possible in DEoptim like SCE-UA? How can i run this?

The code is below,

`#install.packages(c("zoo", "latticeExtra", "polynom", "car", "Hmisc","reshape"))

install.packages("hydromad", repos="http://hydromad.catchment.org")

install.packages("sf")

install.packages("swmmr") #install packages

install.packages("DEoptim")

install.packages("readxl")

install.packages("xts")

install.packages("openxlsx")

library(package = "hydromad")

library(sf)

library(swmmr) #Load package library(DEoptim) library(openxlsx) library(readxl) library(xts) setwd("C:/Users/USER/Downloads/optimization/test_sh") #Assign working directory

getwd() #Check working directory

inp_file<-file.path("C:/Users/USER/Downloads/optimization/test_sh/samho2.inp") #Read initial input file

swmm_file<-run_swmm(inp_file)

obs<-read_xlsx("obs2.xlsx")

obs2<-xts(obs$V1,order.by=obs$...1)

write.csv(as.data.frame(obs),"obs.csv")

inp<-read_inp(swmm_file$inp)

nse <- function(x){1-sum((x[,1]-x[,2])^2)/sum((x[,1]-mean(x[,1]))^2)} obj_fun <- function(x, inp, obs2){

inp$subcatchments <- within(inp$subcatchments,{

Perc_Imperv<-x[1]

Width<-x[2]

})

inp$subareas<-within(inp$subareas,{ "N-Imperv"<-x[1] "N-Perv"<-x[2] "S-Imperv"<-x[3] "S-Perv"<-x[4] PctZero<-x[5] }) inp$infiltration<-within(inp$infiltration,{ MaxRate<-x[6] MinRate<-x[7] Decay<-x[8] }) inp$conduits<-within(inp$conduits,{ Roughness<-x[9] })

inp$infiltration<-within(inp$infiltration,{

DryTime<-x[10]

})

inp$subcatchments <- within(inp$subcatchments,{

Perc_Slope<-x[11]

})

tmp_inp<-file.path("C:/Users/USER/Downloads/optimization/test_sh/inp/sh2.inp")

tmp_inp<-tempfile()

write_inp(inp, tmp_inp) swmm_file<-suppressMessages(run_swmm(tmp_inp, stdout = NULL))

on.exit(file.remove(unlist(swmm_file)))

sim<-read_out(swmm_file$out, iType = 2, object_name = "1036-7500", vIndex = 0)[["1036-7500"]]$flow_rate nse(merge(obs2,sim))*-1 }

set.seed(1234)

calibration_res<-DEoptim( fn=obj_fun, lower = c(0.011,0.1,1.6,3.8,10,76,2.5,1,0.011), upper = c(0.02,0.35,3.8,6.4,30,254,25.4,4,0.02), control = list(

strategy=6,

#NP=100,
#p=1,
itermax=100,
trace=10,
packages=c("swmmr"),
parVar=c("nse"),
parallelType = 0),

inp=inp, obs=obs2 )

summary(calibration_res)`

joelnc commented 2 years ago

Hi,

The DEOptim package has an initialpop argument that you can use. https://cran.r-project.org/web/packages/DEoptim/DEoptim.pdf

So for example, after running your optimization call above, you could take the final population and feed that in as the starting population for the next run. To extract the final population from the run above, that would be:

bestPop_firstRun <- calibration_res$member$pop. DEOptim documentation breaks out how to get best single member, etc.

Then in your DEOptim control list, you would add initialpop=bestPop_firstRun as a control argument and it should start there. Just be sure to use the same formatting if you're constructing the initial population yourself rather than reusing DEOptim output.

boram1024 commented 2 years ago

Hi,

The DEOptim package has an initialpop argument that you can use. https://cran.r-project.org/web/packages/DEoptim/DEoptim.pdf

So for example, after running your optimization call above, you could take the final population and feed that in as the starting population for the next run. To extract the final population from the run above, that would be:

bestPop_firstRun <- calibration_res$member$pop. DEOptim documentation breaks out how to get best single member, etc.

Then in your DEOptim control list, you would add initialpop=bestPop_firstRun as a control argument and it should start there. Just be sure to use the same formatting if you're constructing the initial population yourself rather than reusing DEOptim output.

Thank you for your comment. Your answer will be of great help to me!!

But, I have additional question. If the variable increases significantly, the running time of the code becomes too long. Code is below; `library(swmmr) #Load package library(DEoptim) library(openxlsx) library(readxl) library(xts) setwd("C:/Users/BORAM/Desktop/R_example/optimization/samho") #Assign working directory

getwd() #Check working directory

inp_file<-file.path("C:/Users/BORAM/Desktop/R_example/optimization/samho/samho2.inp") #Read initial input file

swmm_file<-run_swmm(inp_file)

obs<-read_xlsx("obs2.xlsx")

obs2<-xts(obs$V1,order.by=obs$...1)

write.csv(as.data.frame(obs),"obs.csv")

inp<-read_inp(swmm_file$inp)

nse <- function(x){1-sum((x[,1]-x[,2])^2)/sum((x[,1]-mean(x[,1]))^2)} obj_fun <- function(x, inp, obs2){ inp$subcatchments <- within(inp$subcatchments,{ Perc_Imperv<-x[1:107] Width<-x[108:214] }) inp$subareas<-within(inp$subareas,{ "N-Imperv"<-x[215] "N-Perv"<-x[216] "S-Imperv"<-x[217] "S-Perv"<-x[218] PctZero<-x[219] }) inp$infiltration<-within(inp$infiltration,{ MaxRate<-x[220] MinRate<-x[221] Decay<-x[222] }) inp$conduits<-within(inp$conduits,{ Roughness<-x[223] })

inp$infiltration<-within(inp$infiltration,{

DryTime<-x[224]

})

inp$subcatchments <- within(inp$subcatchments,{

Perc_Slope<-x[225]

})

tmp_inp<-file.path("C:/Users/BORAM/Desktop/R_example/optimization/samho/inp/sh2.inp")

tmp_inp<-tempfile() write_inp(inp, tmp_inp) swmm_file<-suppressMessages(run_swmm(tmp_inp, stdout = NULL))

on.exit(file.remove(unlist(swmm_file)))

sim<-read_out(swmm_file$out, iType = 2, object_name = "1036-7500", vIndex = 0)[["1036-7500"]]$flow_rate nse(merge(obs2,sim))*-1 }

set.seed(1234)

calibration_res<-DEoptim( fn=obj_fun, lower=c(inp[["subcatchments"]][["Perc_Imperv"]]0.5, #Perc_Imperv inp[["subcatchments"]][["Width"]]0.5, #Perc_Imperv 0.011, #N-Imperv 0.1, #N-Perv 1.6, #S-Imperv 3.8, #S-Perv 10, #PctZero 76, #Max_R 2.5, #Min_R 1, #Decay 0.011 ), #Manning upper=c(inp[["subcatchments"]][["Perc_Imperv"]]1.5, inp[["subcatchments"]][["Width"]]1.5, 0.02, #N-Imperv 0.35, #N-Perv 3.8, #S-Imperv 6.4, #S-Perv 30, #PctZero 254, #Max_R 25.4, #Min_R 4, #Decay 0.02), #Manning

lower = c(0.011,0.1,1.6,3.8,10,76,2.5,1,0.011),

upper = c(0.02,0.35,3.8,6.4,30,254,25.4,4,0.02),

control = list(

strategy=6,

#NP=100,
#p=1,
itermax=100,
trace=10,
packages=c("swmmr"),
parVar=c("nse"),
parallelType = 0),

inp=inp, obs=obs2 )

summary(calibration_res)`

When variables are configured like the code above, the processing speed is very slow(almost not running). In this case, I wonder how to improve the processing speed(like multicore processing or GPU processing).

joelnc commented 2 years ago

That doesn't look like too many parameters. Multi-core is a bit of an art with this.

DEOptim control list takes a parallel processing argument, but your SWMM inp file also can be set to use a specific number of cores. So you need to decide how to divide things up.

I would start with setting your SWMM model to use 1 core in the .inp file (assuming you have many cores) and then set the control option in DEOptim to parallelType=1 (use all cores). That way it will at least be running multiple models concurrently (assuming your have multiple cores). But then running models standalone / outside of optimization, remember to reset the .inp file to use multiple threads for when you're just running one at a time.

I've never messed with GPU, can't advise on that.

Lastly, you can mess with your routing steps, time steps, hot-starts, etc, to speedup your base model as much as possible. Every second counts when moving on to running it thousands of times.

boram1024 commented 2 years ago

That doesn't look like too many parameters. Multi-core is a bit of an art with this.

DEOptim control list takes a parallel processing argument, but your SWMM inp file also can be set to use a specific number of cores. So you need to decide how to divide things up.

I would start with setting your SWMM model to use 1 core in the .inp file (assuming you have many cores) and then set the control option in DEOptim to parallelType=1 (use all cores). That way it will at least be running multiple models concurrently (assuming your have multiple cores). But then running models standalone / outside of optimization, remember to reset the .inp file to use multiple threads for when you're just running one at a time.

I've never messed with GPU, can't advise on that.

Lastly, you can mess with your routing steps, time steps, hot-starts, etc, to speedup your base model as much as possible. Every second counts when moving on to running it thousands of times.

Thank you so much,,,

Basically, DEoptim package serve parallel computing like your comment. But, SCE-UA(hydromad) package does not serve parallel option. In that case, How can i speed up my processing time? Code is below;; `library(swmmr) #Load package library(DEoptim) library(openxlsx) library(readxl) library(xts) library(package='hydromad') setwd("C:/Users/BORAM/Desktop/R_example/optimization/samho") #Assign working directory

getwd() #Check working directory

inp_file<-file.path("C:/Users/BORAM/Desktop/R_example/optimization/samho/samho2.inp") #Read initial input file

swmm_file<-run_swmm(inp_file)

obs<-read_xlsx("obs2.xlsx")

obs2<-xts(obs$V1,order.by=obs$...1)

write.csv(as.data.frame(obs),"obs.csv")

inp<-read_inp(swmm_file$inp)

nse <- function(x){1-sum((x[,1]-x[,2])^2)/sum((x[,1]-mean(x[,1]))^2)} obj_fun <- function(x, inp, obs2){ inp$subcatchments <- within(inp$subcatchments,{ Perc_Imperv<-x[1:107] Width<-x[108:214] }) inp$subareas<-within(inp$subareas,{ "N-Imperv"<-x[215] "N-Perv"<-x[216] "S-Imperv"<-x[217] "S-Perv"<-x[218] PctZero<-x[219] }) inp$infiltration<-within(inp$infiltration,{ MaxRate<-x[220] MinRate<-x[221] Decay<-x[222] }) inp$conduits<-within(inp$conduits,{ Roughness<-x[223] })

inp$infiltration<-within(inp$infiltration,{

DryTime<-x[224]

})

inp$subcatchments <- within(inp$subcatchments,{

Perc_Slope<-x[225]

})

tmp_inp<-file.path("C:/Users/BORAM/Desktop/R_example/optimization/samho/inp/sh2.inp")

tmp_inp<-tempfile()

write_inp(inp, tmp_inp) swmm_file<-suppressMessages(run_swmm(tmp_inp, stdout = NULL))

on.exit(file.remove(unlist(swmm_file)))

sim<-read_out(swmm_file$out, iType = 2, object_name = "1036-7500", vIndex = 0)[["1036-7500"]]$flow_rate nse(merge(obs2,sim))*-1 }

set.seed(1234)

system.time({sce<-SCEoptim( obj_fun, par=c(inp[["subcatchments"]][["Perc_Imperv"]], #Perc_Imperv inp[["subcatchments"]][["Width"]], #Perc_Imperv 0.03, #N-Imperv 0.3, #N-Perv 2.5, #S-Imperv 5.0, #S-Perv 10, #PctZero 254, #Max_R 25.4, #Min_R 2, #Decay 0.015), lower=c(inp[["subcatchments"]][["Perc_Imperv"]]0.5, #Perc_Imperv inp[["subcatchments"]][["Width"]]0.5, #Perc_Imperv 0.011, #N-Imperv 0.1, #N-Perv 1.6, #S-Imperv 3.8, #S-Perv 10, #PctZero 76, #Max_R 2.5, #Min_R 1, #Decay 0.011 ), #Manning upper=c(inp[["subcatchments"]][["Perc_Imperv"]]1.5, inp[["subcatchments"]][["Width"]]1.5, 0.02, #N-Imperv 0.35, #N-Perv 3.8, #S-Imperv 6.4, #S-Perv 30, #PctZero 254, #Max_R 25.4, #Min_R 4, #Decay 0.02), #Manning

lower = c(0.011,0.1,1.6,3.8,10,76,2.5,1,0.011),

upper = c(0.02,0.35,3.8,6.4,30,254,25.4,4,0.02),

control = list( trace=1, ncomplex=20, maxit=10), inp=inp, obs=obs2 )})

summary(sce)`

joelnc commented 2 years ago

Seems silly to run SCE and not have the work distributed among cores, but on quick review it does not appear to be trivial to set that up with the library you are using.

I would suggest closing this swmmr Issue related to starting populations if that is resolved, since SCE multi-threading is a different topic, and not even directly swmmr related.

See here to start on parallelizing: https://stackoverflow.com/questions/15397390/parallelized-optimization-in-r

There may even be other SCE implementations that can be run in parallel, but you'll have to look around on that.

boram1024 commented 2 years ago

Seems silly to run SCE and not have the work distributed among cores, but on quick review it does not appear to be trivial to set that up with the library you are using.

I would suggest closing this swmmr Issue related to starting populations if that is resolved, since SCE multi-threading is a different topic, and not even directly swmmr related.

See here to start on parallelizing: https://stackoverflow.com/questions/15397390/parallelized-optimization-in-r

There may even be other SCE implementations that can be run in parallel, but you'll have to look around on that.

You are right. Thank you so much for answering my first question. I hope that someday will come I can help you.