libprima / prima

PRIMA is a package for solving general nonlinear optimization problems without using derivatives. It provides the reference implementation for Powell's derivative-free optimization methods, i.e., COBYLA, UOBYQA, NEWUOA, BOBYQA, and LINCOA. PRIMA means Reference Implementation for Powell's methods with Modernization and Amelioration, P for Powell.
http://libprima.net
BSD 3-Clause "New" or "Revised" License

`fortran/cobyla` does not pass `dtest` #41

Open zaikunzhang opened 1 year ago

zaikunzhang commented 1 year ago

With the flang shipped in AMD clang version 14.0.6, `fortran/cobyla` at commit 54e66dd does not pass `dtest`.

1. With

        git checkout 54e66dd && cd fortran/tests/ && make clean && make dtest_i2_r4_d1_tst.cobyla

    we get

        dtest_i2_r4_d1_tst_g starts.
        Warning: COBYLA: MAXFILT is too small; it is set to 200.
        0: ALLOCATE: 395136991236 bytes requested; not enough memory
        [Inferior 1 (process 1077021) exited with code 0177]
        No stack.
        dtest_i2_r4_d1_tst ends at 2023.08.04_18.17.13.
        Cleaning up miscellaneous files ... Done.

2. With

        git checkout 54e66dd && cd fortran/examples/cobyla && make clean && make dtest

    we get

        HEAD is now at 54e66dde 230804.180548.CST fix a memory leak in example/cobyla_example.f90
        /opt/AMD/aocc-compiler-4.0.0/bin/flang -Wall -Wextra -std=f2018 -Mstandard -O3 -g -c -o consts.o ../../common/consts.F90
        /opt/AMD/aocc-compiler-4.0.0/bin/flang -Wall -Wextra -std=f2018 -Mstandard -O3 -g -c -o infos.o ../../common/infos.f90
        /opt/AMD/aocc-compiler-4.0.0/bin/flang -Wall -Wextra -std=f2018 -Mstandard -O3 -g -c -o debug.o ../../common/debug.F90
        /opt/AMD/aocc-compiler-4.0.0/bin/flang -Wall -Wextra -std=f2018 -Mstandard -O3 -g -c -o inf.o ../../common/inf.F90
        /opt/AMD/aocc-compiler-4.0.0/bin/flang -Wall -Wextra -std=f2018 -Mstandard -O3 -g -c -o infnan.o ../../common/infnan.F90
        /opt/AMD/aocc-compiler-4.0.0/bin/flang -Wall -Wextra -std=f2018 -Mstandard -O3 -g -c -o memory.o ../../common/memory.F90
        /opt/AMD/aocc-compiler-4.0.0/bin/flang -Wall -Wextra -std=f2018 -Mstandard -O3 -g -c -o string.o ../../common/string.f90
        /opt/AMD/aocc-compiler-4.0.0/bin/flang -Wall -Wextra -std=f2018 -Mstandard -O3 -g -c -o linalg.o ../../common/linalg.f90
        /opt/AMD/aocc-compiler-4.0.0/bin/flang -Wall -Wextra -std=f2018 -Mstandard -O3 -g -c -o powalg.o ../../common/powalg.f90
        /opt/AMD/aocc-compiler-4.0.0/bin/flang -Wall -Wextra -std=f2018 -Mstandard -O3 -g -c -o ratio.o ../../common/ratio.f90
        /opt/AMD/aocc-compiler-4.0.0/bin/flang -Wall -Wextra -std=f2018 -Mstandard -O3 -g -c -o redrho.o ../../common/redrho.f90
        /opt/AMD/aocc-compiler-4.0.0/bin/flang -Wall -Wextra -std=f2018 -Mstandard -O3 -g -c -o history.o ../../common/history.f90
        /opt/AMD/aocc-compiler-4.0.0/bin/flang -Wall -Wextra -std=f2018 -Mstandard -O3 -g -c -o selectx.o ../../common/selectx.f90
        /opt/AMD/aocc-compiler-4.0.0/bin/flang -Wall -Wextra -std=f2018 -Mstandard -O3 -g -c -o checkexit.o ../../common/checkexit.f90
        /opt/AMD/aocc-compiler-4.0.0/bin/flang -Wall -Wextra -std=f2018 -Mstandard -O3 -g -c -o fprint.o ../../common/fprint.f90
        /opt/AMD/aocc-compiler-4.0.0/bin/flang -Wall -Wextra -std=f2018 -Mstandard -O3 -g -c -o message.o ../../common/message.f90
        /opt/AMD/aocc-compiler-4.0.0/bin/flang -Wall -Wextra -std=f2018 -Mstandard -O3 -g -c -o preproc.o ../../common/preproc.f90
        /opt/AMD/aocc-compiler-4.0.0/bin/flang -Wall -Wextra -std=f2018 -Mstandard -O3 -g -c -o pintrf.o ../../common/pintrf.f90
        /opt/AMD/aocc-compiler-4.0.0/bin/flang -Wall -Wextra -std=f2018 -Mstandard -O3 -g -c -o evaluate.o ../../common/evaluate.f90
        /opt/AMD/aocc-compiler-4.0.0/bin/flang -Wall -Wextra -std=f2018 -Mstandard -O3 -g -c -o update.o ../../cobyla/update.f90
        /opt/AMD/aocc-compiler-4.0.0/bin/flang -Wall -Wextra -std=f2018 -Mstandard -O3 -g -c -o initialize.o ../../cobyla/initialize.f90
        /opt/AMD/aocc-compiler-4.0.0/bin/flang -Wall -Wextra -std=f2018 -Mstandard -O3 -g -c -o trustregion.o ../../cobyla/trustregion.f90
        /opt/AMD/aocc-compiler-4.0.0/bin/flang -Wall -Wextra -std=f2018 -Mstandard -O3 -g -c -o geometry.o ../../cobyla/geometry.f90
        /opt/AMD/aocc-compiler-4.0.0/bin/flang -Wall -Wextra -std=f2018 -Mstandard -O3 -g -c -o cobylb.o ../../cobyla/cobylb.f90
        /opt/AMD/aocc-compiler-4.0.0/bin/flang -Wall -Wextra -std=f2018 -Mstandard -O3 -g -c -o cobyla.o ../../cobyla/cobyla.f90
        /opt/AMD/aocc-compiler-4.0.0/bin/flang -Wall -Wextra -std=f2018 -Mstandard -O3 -g -o dtest cobyla_example.f90 *.o
        ./dtest
        make: [Makefile:84: dtest] Segmentation fault (core dumped)
        make: Deleting file 'dtest'
        rm inf.o preproc.o ratio.o trustregion.o redrho.o infnan.o pintrf.o debug.o geometry.o message.o evaluate.o selectx.o infos.o fprint.o string.o powalg.o cobylb.o initialize.o consts.o update.o cobyla.o history.o checkexit.o linalg.o memory.o


`gdb ./dtest THE_CORE_FILE` gives us

    gdb ./dtest core.1080251
    GNU gdb (Ubuntu 12.1-0ubuntu1~22.04) 12.1
    Copyright (C) 2022 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.
    Type "show copying" and "show warranty" for details.
    This GDB was configured as "x86_64-linux-gnu".
    Type "show configuration" for configuration details.
    For bug reporting instructions, please see:
    https://www.gnu.org/software/gdb/bugs/.
    Find the GDB manual and other documentation resources online at:
    http://www.gnu.org/software/gdb/documentation/.
    For help, type "help".
    Type "apropos word" to search for commands related to "word"...
    Reading symbols from ./dtest...
    [New LWP 1080251]
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/usr/lib/x86_64-linux-gnu/libthread_db.so.1".
    Core was generated by `./dtest'.
    Program terminated with signal SIGSEGV, Segmentation fault.
    #0  calcfc_mod::calcfc_chebyquad (f=<error reading variable: Cannot access memory at address 0x0>, x=..., constr=...) at cobyla_example.f90:41
    41      f = 0.0_RP
    (gdb) bt
    #0  calcfc_mod::calcfc_chebyquad (f=<error reading variable: Cannot access memory at address 0x0>, x=..., constr=...) at cobyla_example.f90:41
    #1  0x0000000000217479 in calcfc_internal (f_internal=<error reading variable: Cannot access memory at address 0x5>, x_internal=..., constr_internal=...) at ../../cobyla/cobylb.f90:667
    #2  0x000000000021c43a in evaluate_mod::evaluatefc (calcfc=0x4155415641574155, f=0.04642817229746083, x=..., constr=...) at ../../common/evaluate.f90:191
    #3  0x000000000022663c in initialize_mod::initxfc (calcfc=-1840700268, iprint=0, maxfun=3000, ctol=1.4901161193847656e-08, f0=0.04642817229746083, ftarget=6.9528757905928421e-310, rhobeg=6.9528757905774272e-310,
        nf=-1125723076, info=-1125729508, constr0=<error reading variable: value requires 1125822181566848 bytes, which is more than max-value-size>, x0=..., chist=..., conhist=..., conmat=...,
        cval=<error reading variable: value requires 228731392 bytes, which is more than max-value-size>, fhist=..., fval=..., sim=..., simi=..., xhist=...,
        evaluated=<error reading variable: value requires 114301120 bytes, which is more than max-value-size>) at ../../cobyla/initialize.f90:161
    #4  0x0000000000213330 in cobylb_mod::cobylb (calcfc=, iprint=<error reading variable: Cannot access memory at address 0x5>, maxfilt=, maxfun=434684319, ctol=,
        cweight=<optimized out>, eta1=<optimized out>, eta2=<optimized out>, ftarget=<optimized out>, gamma1=<optimized out>, gamma2=<optimized out>, rhobeg=<optimized out>, rhoend=<optimized out>,
        f=<optimized out>, nf=<optimized out>, cstrv=<optimized out>, info=<optimized out>, amat=..., bvec=..., constr=..., x=..., chist=..., conhist=..., fhist=..., xhist=...) at ../../cobyla/cobylb.f90:215
    #5  0x000000000020da6b in cobyla_mod::cobyla (calcfc=-1840700268, m_nlcon=0, f=0.04642817229746083, cstrv=0, f0=0, nf=0, rhobeg=0, rhoend=0, ftarget=0, ctol=0, cweight=0, maxfun=0, iprint=0, eta1=0, eta2=0,
        gamma1=0, gamma2=0, maxhist=0, maxfilt=0, info=0, x=<error reading variable: value requires 4431462850816 bytes, which is more than max-value-size>,
        nlconstr=<error reading variable: Cannot access memory at address 0x0>, aineq=<error reading variable: value requires 241256040910333888 bytes, which is more than max-value-size>,
        bineq=<error reading variable: value requires 113962208 bytes, which is more than max-value-size>, aeq=<error reading variable: value requires 11518432 bytes, which is more than max-value-size>, beq=...,
        xl=<error reading variable: value requires 319772152347648672 bytes, which is more than max-value-size>, xu=...,
        nlconstr0=<error reading variable: value requires 241256040910333888 bytes, which is more than max-value-size>, xhist=<not allocated>, fhist=<not allocated>, chist=<not allocated>, nlchist=<not allocated>)
        at ../../cobyla/cobyla.f90:600
    #6  0x00000000002090c7 in cobyla_exmp () at cobyla_example.f90:106

zaikunzhang commented 1 year ago

This problem occurs only with AMD flang. My guess is that the flang in AMD clang version 14.0.6 does not work well with internal subroutines used as arguments.

To be investigated.
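For reference, the pattern in question (an internal subroutine passed as an actual argument, as `calcfc_internal` is in the backtrace) can be sketched in isolation as follows. This is only a minimal illustration: all names here (`solver_mod`, `solve`, `OBJ`, `shifted_sumsq`) are hypothetical and not taken from PRIMA, and it has not been confirmed that this small program reproduces the failure with AMD flang.

```fortran
! Hypothetical minimal sketch; none of these names come from PRIMA.
module solver_mod
implicit none
private
public :: solve, OBJ

abstract interface
    subroutine OBJ(x, f)
    real, intent(in) :: x(:)
    real, intent(out) :: f
    end subroutine OBJ
end interface

contains

subroutine solve(calfun, x, f)
! Receives a procedure argument and simply invokes it once.
procedure(OBJ) :: calfun
real, intent(in) :: x(:)
real, intent(out) :: f
call calfun(x, f)
end subroutine solve

end module solver_mod

program internal_proc_arg
use solver_mod, only : solve
implicit none
real :: x(3), f, shift

x = [1.0, 2.0, 3.0]
shift = 0.5
! Passing an internal procedure as an actual argument is allowed since Fortran 2008.
call solve(shifted_sumsq, x, f)
print *, 'f = ', f

contains

subroutine shifted_sumsq(x_internal, f_internal)
real, intent(in) :: x_internal(:)
real, intent(out) :: f_internal
! Host association: `shift` belongs to the enclosing program.
f_internal = sum((x_internal - shift)**2)
end subroutine shifted_sumsq

end program internal_proc_arg
```

If a program of this shape also crashes under the same compiler and flags, that would support the guess above; if it runs correctly, the cause is probably more specific to how COBYLA wraps the user's `calcfc`.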

zaikunzhang commented 1 year ago

The failure of https://github.com/zequipe/prima/actions/runs/5759832956 is due to this problem.

zaikunzhang commented 1 year ago

Update: With AOCC 4.1, the test still fails.

zaikunzhang commented 1 year ago

See the AMD Server Guru forum https://community.amd.com/t5/server-gurus-discussions/aocc-4-1-flang-quot-0-allocate-xxx-bytes-requested-not-enough/m-p/623584#M2038

Update: This problem has received attention at the AMD Server Guru forum, and people are looking into the issue.