MLopez-Ibanez / irace

Iterated Racing for Automatic Algorithm Configuration
https://mlopez-ibanez.github.io/irace/
GNU General Public License v2.0
58 stars 14 forks source link

Add `multi_irace` function #60

Closed Saethox closed 1 year ago

Saethox commented 1 year ago

This is only the sequential version, so I'm making this a draft for now.

It is my first time writing R, so please tell me if there are more idiomatic ways to do something. I'm also not quite sure how the building process works in R. I did run devtools::document() and commit the generated files. Is this already sufficient?

For the parallel version, I'm thinking of adding parallel and njobs parameters.

MLopez-Ibanez commented 1 year ago

If you use Rstudio, there are buttons and menus to do everything. From the command-line, you can use devtools: https://r-pkgs.org/whole-game.html If you are in Linux, there is a Makefile so you can do:

make build
make check
make install

Maybe the Makefile also works in Mac?

MLopez-Ibanez commented 1 year ago

Does it really work? You may wish to create a test that runs this function (see tests/)

Saethox commented 1 year ago

Thanks for your review!

I had some trouble building on Windows, which seems to be some trouble with the mingw build bundled in the rtools43. Do you have some experience with building on Windows?

> devtools::load_all()
ℹ Loading irace
ℹ Re-compiling irace (debug build)
── R CMD INSTALL ───────────────────────────────────────────────────────────────────────────────────────────────────────
─  installing *source* package 'irace' ...
   ** using staged installation
   ** libs
   gcc   -O2 -Wall -gdwarf-2 -mfpmath=sse -msse2 -mstackrealign  -UNDEBUG -Wall -pedantic -g -O0 -fdiagnostics-color=always iracebin/irace.o iracebin/whereami.o -o iracebin/irace.exe
   C:\Program Files\R\rtools43/x86_64-w64-mingw32.static.posix/bin/ld.exe: iracebin/irace.o: in function `get_install_path':
   ~/irace/src/iracebin/irace.c:23: undefined reference to `__stack_chk_fail'
   C:\Program Files\R\rtools43/x86_64-w64-mingw32.static.posix/bin/ld.exe: iracebin/irace.o: in function `exec_R':
 ~/irace/src/iracebin/irace.c:49: undefined reference to `__stack_chk_fail'
   C:\Program Files\R\rtools43/x86_64-w64-mingw32.static.posix/bin/ld.exe: iracebin/whereami.o: in function `wai_getExecutablePath':
  ~/irace/src/iracebin/whereami.c:206: undefined reference to `realpath'
   C:\Program Files\R\rtools43/x86_64-w64-mingw32.static.posix/bin/ld.exe: ~/irace/src/iracebin/whereami.c:232: undefined reference to `__stack_chk_fail'
   C:\Program Files\R\rtools43/x86_64-w64-mingw32.static.posix/bin/ld.exe: iracebin/whereami.o: in function `wai_getModulePath':
  ~/irace/src/iracebin/whereami.c:278: undefined reference to `__isoc99_sscanf'
   C:\Program Files\R\rtools43/x86_64-w64-mingw32.static.posix/bin/ld.exe: ~/irace/src/iracebin/whereami.c:285: undefined reference to `realpath'
   C:\Program Files\R\rtools43/x86_64-w64-mingw32.static.posix/bin/ld.exe: ~/irace/src/iracebin/whereami.c:368: undefined reference to `__stack_chk_fail'
   collect2.exe: error: ld returned 1 exit status
   make: *** [Makevars-common:10: iracebin/irace.exe] Error 1
   ERROR: compilation failed for package 'irace'

So I used wsl to run make check, and unfortunately even on master I get the following errors:

* checking examples ... ERROR
Running examples in ‘irace-Ex.R’ failed
The error most likely occurred in:

> base::assign(".ptime", proc.time(), pos = "CheckExEnv")
> ### Name: ablation
> ### Title: Performs ablation between two configurations (from source to
> ###   target).
> ### Aliases: ablation
>
> ### ** Examples
>
> ## No test:
> logfile <- system.file(package="irace", "exdata", "sann.rda")
> # Execute ablation between the first and the best configuration found by irace.
> ablog <- ablation(logfile, ablationLogFile = NULL)
Warning: file ‘sann.rda’ has magic number '../..'
  Use of save versions prior to 2 is deprecated
Error in load(filename) :
  bad restore file magic number (file may be corrupted) -- no data loaded
Calls: ablation -> read_logfile -> load
Execution halted
* checking tests ...
  Running ‘testthat.R’
 ERROR
Running the tests in ‘tests/testthat.R’ failed.
Last 13 lines of output:
   6.           └─irace:::target.error(err.msg, output, scenario, target.runner.call = output$call) at irace/R/race-wrapper.R:291:4
   7.             └─irace:::irace.error(...) at irace/R/race-wrapper.R:98:2
  ── Failure ('test-ablation-cmdline.R:12:5'): --log-file=sann.rda ───────────────
  Expected `ablation_cmdline(...)` to run without any warnings.
  i Actually got a <simpleWarning> with text:
    file 'sann.rda' has magic number '../..'
      Use of save versions prior to 2 is deprecated
  ── Failure ('test-capping.R:70:3'): cap.irace maxTime = 1000 ───────────────────
  `cap.irace(maxTime = 1000)` did not throw the expected warning.

  [ FAIL 3 | WARN 3 | SKIP 1 | PASS 1784 ]
  Error: Test failures
  > proc.time()
     user  system elapsed
   26.146   8.136  41.984

Running make check on my branch produced the exact same errors, as expected.

MLopez-Ibanez commented 1 year ago

gcc -O2 -Wall -gdwarf-2 -mfpmath=sse -msse2 -mstackrealign -UNDEBUG -Wall -pedantic -g -O0 -fdiagnostics-color=always iracebin/irace.o iracebin/whereami.o -o iracebin/irace.exe C:\Program Files\R\rtools43/x86_64-w64-mingw32.static.posix/bin/ld.exe: iracebin/irace.o: in function get_install_path': ~/irace/src/iracebin/irace.c:23: undefined reference to__stack_chk_fail'

This seems a problem with the linker flags. I don't define any, so I am not sure what is triggering the error.

Are you using mingw32 on a 64-bits windows system, instead of mingw64?

The github actions make sure that the package builds on Windows, so it should work. For windows, you could try https://r-hub.github.io/rhub/ to build and check remotely. It is perhaps a bit slower, but it should work.

~/irace/src/iracebin/whereami.c:206: undefined reference to `realpath'

This is strange. This code has all kinds of alternatives to realpath in Windows. Why are those not being used?

So I used wsl to run make check, and unfortunately even on master I get the following errors:

No test:

logfile <- system.file(package="irace", "exdata", "sann.rda")

Execute ablation between the first and the best configuration found by irace.

ablog <- ablation(logfile, ablationLogFile = NULL) Warning: file ‘sann.rda’ has magic number '../..' Use of save versions prior to 2 is deprecated Error in load(filename) : bad restore file magic number (file may be corrupted) -- no data loaded Calls: ablation -> read_logfile -> load Execution halted

This is a different error. Something is wrong with the file sann.rda. Which version of R are you using?

Can you try from R to load the file with load("./inst/exdata/sann.rda") ?

MLopez-Ibanez commented 1 year ago

It compiles and check without errors, but you probably need a test to check that it is actually doing the right thing (and to no make our coverage even worse than it is!). See the examples in tests, which often use dummy target-runners that are very fast.

Saethox commented 1 year ago

This seems a problem with the linker flags. I don't define any, so I am not sure what is triggering the error.

Are you using mingw32 on a 64-bits windows system, instead of mingw64?

The rtools43 should come with MinGW-w64.

This is a different error. Something is wrong with the file sann.rda. Which version of R are you using?

I'm using R 4.3.1 on both Windows and wsl.

Can you try from R to load the file with load("./inst/exdata/sann.rda") ?

This also happens on both:

> load("./inst/exdata/sann.rda")
Error in load("./inst/exdata/sann.rda") :
  bad restore file magic number (file may be corrupted) -- no data loaded
In addition: Warning message:
file ‘sann.rda’ has magic number '../..'
  Use of save versions prior to 2 is deprecated
MLopez-Ibanez commented 1 year ago

I'm using R 4.3.1 on both Windows and wsl.

The package builds on Windows as you can see here: https://github.com/MLopez-Ibanez/irace/actions/runs/6187284539/job/16821365332?pr=60

You may also try https://builder.r-hub.io/ (select Windows in advanced)

> load("./inst/exdata/sann.rda")
Error in load("./inst/exdata/sann.rda") :
  bad restore file magic number (file may be corrupted) -- no data loaded
In addition: Warning message:
file ‘sann.rda’ has magic number '../..'
  Use of save versions prior to 2 is deprecated

I don't know what is going on in your computer but the file is correct in github, and the checks above show that it works in Windows, Mac and Linux. Are you sure "git" does not show changes to this file?

If your local setup does not work, you can still submit versions to r-hub for building and testing.

MLopez-Ibanez commented 1 year ago

TODO:

Saethox commented 1 year ago

I don't know what is going on in your computer but the file is correct in github, and the checks above show that it works in Windows, Mac and Linux. Are you sure "git" does not show changes to this file?

Git doesn't show any changes. This is all very mysterious.

If your local setup does not work, you can still submit versions to r-hub for building and testing.

Thanks for the tip! Guess I'll stick to building through r-hub.

Saethox commented 1 year ago

change execDir and logFile for each run.

I'm not really sure how to handle the logFile. The default behaviour of checkScenario is to set the logFile to execDir/irace.rdata, so adding the index to the execDir and logFile results in execDir/run_{i} and execDir/logFile_{i}.rdata, which means that logFile is not located in the new execDir.

It's not unreasonable to expect that the user may have called checkScenario on the scenarios supplied to multi_irace, right?

Should multi_irace detect if logFile is an absolute path and located inside the old execDir, and move it inside the new execDir? Adding the index to the logFile would also only be necessary if it is an absolute path that is not located in execDir.

MLopez-Ibanez commented 1 year ago

Should multi_irace detect if logFile is an absolute path and located inside the old execDir, and move it inside the new execDir? Adding the index to the logFile would also only be necessary if it is an absolute path that is not located in execDir.

I think this a good idea. Then we do not have to add the index to logFile. Perhaps call checkScenario() first to make everything is absolute, then use something similar to:

new_execDir <- sprintf("%s_%002d", sub("/$", "", scenario$execDir), "_", i))
scenario$logFile <- sub(scenario$execDir, new_execDir, scenario$logFile)
scenario$execDir <- new_execDir 

(I wrote the above without testing).

MLopez-Ibanez commented 1 year ago

Or use execDir/run_{i} as you are doing already, but I would prefer to print with leading zeroes like run_01 run_02, etc. because it is easier to order the folders later. It would be extremely rare to have more than 99 runs.

MLopez-Ibanez commented 1 year ago

Also, would you like to write a NEWS.md entry? I can write it for you if you are not sure what to say.

Saethox commented 1 year ago

Also, would you like to write a NEWS.md entry? I can write it for you if you are not sure what to say.

I added an entry for multi_irace. Should it be more specific?

MLopez-Ibanez commented 1 year ago

No, that is sufficient. A couple of minor things:

Thanks!

On Fri, 22 Sept 2023, 20:22 Jonathan Wurth, @.***> wrote:

Also, would you like to write a NEWS.md entry? I can write it for you if you are not sure what to say.

I added an entry for multi_irace. Should it be more specific?

— Reply to this email directly, view it on GitHub https://github.com/MLopez-Ibanez/irace/pull/60#issuecomment-1731931891, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAT7U5JENJU34TYVKJQPMA3X3XQPNANCNFSM6AAAAAA4WU7AHU . You are receiving this because you commented.Message ID: @.***>

MLopez-Ibanez commented 1 year ago

Thanks! I have merged but I also did a few minor changes after the merging. In particular, I have moved the function to its own file, since I expect it to keep growing when you implement the parallel version.

I also do not think the handling of random seeds is completely correct, since gen_random_seeds(10) returns a list of 70 values, which does not seem correct. But comparing the parallel versus sequential variants will probably shed more light into this aspect.

Looking forward to the next part!