Closed jschueller closed 1 year ago
@zaikunzhang I added a commit to introduce an experimental Python binding, in order to see if this can be useful for scipy, see the scipy discussion (xref https://github.com/scipy/scipy/issues/18118)
See the :open_file_folder: files view, the :scroll:action log or :angel: SARIF report for details.
Hi @jschueller if you would like to continue the discussion here it seems it would be better, I'll be glad to give any inputs :)
See the :open_file_folder: files view, the :scroll:action log or :angel: SARIF report for details.
So, I was looking at your proposal and indeed you could completely bypass all the C wrapper code. Let's say you wanted to test, take the interfaces you created "cobylai,etc" and in the BIND(C) add "BIND(C,name="prima_cobyla")". Shut off the C wrappers, the final shared library would contain the same functions already there, that are callable from python.
You might need in that case the abstract interfaces for the functions.
yes, the C layer only simplifies a bit some arguments (no need to pass f, nf is returned through maxfun, info is returned directly)
See the :open_file_folder: files view, the :scroll:action log or :angel: SARIF report for details.
See the :open_file_folder: files view, the :scroll:action log or :angel: SARIF report for details.
See the :open_file_folder: files view, the :scroll:action log or :angel: SARIF report for details.
See the :open_file_folder: files view, the :scroll:action log or :angel: SARIF report for details.
See the :open_file_folder: files view, the :scroll:action log or :angel: SARIF report for details.
See the :open_file_folder: files view, the :scroll:action log or :angel: SARIF report for details.
See the :open_file_folder: files view, the :scroll:action log or :angel: SARIF report for details.
See the :open_file_folder: files view, the :scroll:action log or :angel: SARIF report for details.
See the :open_file_folder: files view, the :scroll:action log or :angel: SARIF report for details.
See the :open_file_folder: files view, the :scroll:action log or :angel: SARIF report for details.
Thank you @jschueller so much for this contribution, which is highly appreciated. I am very sorry for the delayed response. Here are a few comments for your possible consideration. I am open to discussions on them.
I am not against changing the module names, as long as the new names are consistent and well thought out.
I do not know Python at all. So the Python interface would have to be reviewed by someone from the SciPy community. (Maybe @jalvesz could help?)
The last time I wrote C was more than 15 years ago. So I do not think I can review the C code properly either.
int prima_cobyla(prima_function_con calcfc, int n, int m, double *x, double rhobeg, double rhoend, int *maxfun, int iprint, void *state);
n
, m
, rhobeg
, etc be specified as constants here?nf
using maxfun
. No ... that is really what I have endeavored to avoid in the modern implementation. There were many such "tricks" in the original F77 implementation. One variable should have only one meaning and one purpose. I hope you agree that the modifications should pass all existing checking. The new Fortran code does not. In particular, it breaks the MATLAB build due to the modification made to fortran/common/pintrf.f90
. In addition, for the same reason, the Fortran code will not compile if PRIMA_REAL_PRECISION
in fortran/common/ppf.h
is not 64. This has to be fixed --- maybe by creating a dedicated pintrf.f90
(and possibly consts.F90
) for the C and Python interfaces? Indeed, similar problems have occurred in the MATLAB build, which needed a dedicated debug.F90
(see matlab/mex_gateways/debug.F90
for what I mean).
The pull request does not pass the spelling check either, but this can be fixed easily by correcting typos (if any) and/or modifying .github/actions/spelling/allow.txt
to include the false positives (if any).
The names cobyli
, newuoi
etc sound a bit vague. Could they be changed to something longer and more expressive?
In the development of PRIMA, I have imposed the "no-warning rule". I constantly compile the Fortran code using all the Fortran compilers available to me (there are twelve of them) with quite strict options and make sure that they emit no warning at all --- except for warnings due to bugs in the compilers (I have found a dozen of them in the past three years). See https://github.com/libprima/prima/blob/main/.github/workflows/lint_hosted.yml for what I mean. I take the view that warnings are more dangerous than errors, and each warning is a future bug. I hope any code added to PRIMA can follow the same rule and be accompanied by some linting facility verifying that the code is standard-conforming and warning-free under the most strict criteria.
I have made some changes to libprima/prima
since your fork. I do not think these changes have any conflict with your branch. It would be great if you could include them in your pull request.
Many many thanks again for your efforts!
Best regards, Zaikun
See the :open_file_folder: files view, the :scroll:action log or :angel: SARIF report for details.
Hi @jschueller ! I see you have included the recent changes I made to libprima/prima. Thank you very much for this quick action.
I would like to draw your attention to
https://github.com/jschueller/prima/actions
for the checking to be passed.
Thanks, Zaikun
yes, I also already adressed some of your remarks, now I'm figuring out how to handle PRIMA_REAL_PRECISION
See the :open_file_folder: files view, the :scroll:action log or :angel: SARIF report for details.
IKC
@jschueller Thank you for your efforts. Indeed, if the Fortran linting at
https://github.com/jschueller/prima/actions/workflows/lint_hosted.yml
passes, then the MATLAB build will likely work. To run the Fortran linting locally, you can do the following.
cd fortran/cobyla
bash ./flint -g
The above will lint the COBYLA code using gfortran. bash ./flint -i
will use ifort
, bash ./flint -x
will use ifx
, etc.
Dear @jschueller,
According to your commits, it seems to me that the issues are nontrivial to fix. What about the following solution?
Before the build, make a copy of the Fortran code, and modify this copy according to the need of the build (e.g., modify pintrf.f90
). Then build using this new copy. I would make this copy in a subdirectory of 'c/', and retain this copy even after the build in case they are needed for debugging in the future. Maybe make this directory hidden so that we / users do not notice it unless needed.
This is what I did for the MATLAB build, which requires specially tailored versions of debug.f90
and fprint.f90
.
In this way, there is even no need to change the module names (but again, I do not think changing them is a bad idea).
In the future, there will be likely other builds (Julia, R, etc). Inevitably, they will also need some modifications to the Fortran code. Moreover, their needs are not necessarily compatible. All of them can be handled in a similar way.
Just a naive idea for your consideration.
Best regards, Zaikun
indeed, it gets tricky around the definition of the OBJ & OBJCON callbacks for C compatibility it must use types matching C, which can be done using the iso_c module (C_INT, C_DOUBLE, ...), which causes problems with quad precision since its not standard but compiler-specific also to avoid crashes on intel compiler the x array must be defined as an assumed-shape array instead of assumed-size (ie defined as x(*) instead x(:)), but then we cannot get the size of x (as in testsuite), so the n argument must be passed too also callbacks must be marked a bind(c) which had to be propagated to the testsuite because of warnings
however I find monkeypatching the code to be dirty solution, maybe it could be worked around with preprocessing like done for the precision finally, I did not explore the solution of using a different callback definition for C (with maybe fixed precision types), which then calls the OBJ/OBJCON callback with standard types with possible conversions & extra copies, I dunno if its feasible, we probably have to store the pointer to the fonction, maybe I have to do something like done in the testsuite to redirect to the right objective function
typedef void (*prima_function_con)(const double *x, double *f, double *con);
By the way, to be consistent, I suggest always using
f
for the objective function value (a scalar)constr
for the constraint value (a vector that contains the values of all the constraint functions)cstrv
for the constraint violation value (a scalar)calfun
for the callback function that evaluates f
at a given x
calcfc
for the callback function that evaluates constr
at a given x
OBJ
(or obj
, if appropriate; can be prefixed by prima
) for the "type" of calfun
OBJCON
(or objcon
, if appropriate; can be prefixed by prima
) for the "type" of calcfc
Thanks.
Hi @jschueller
https://github.com/jschueller/prima/blob/480cada346fc289e5c98a9b752a1df752769d533/c/prima.c#L83
"failed" should be "fails" here. Thanks.
Hi @jschueller
finally, I did not explore the solution of using a different callback definition for C (with maybe fixed precision types), which then calls the OBJ/OBJCON callback with standard types with possible conversions & extra copies, I dunno if its feasible, we probably have to store the pointer to the fonction, maybe I have to do something like done in the testsuite to redirect to the right objective function
fun_ptr
) into a subroutine calfun
that follows the specification of pintrf.f90
. SeeAs you can see, calfun
calls the following evalcb_f
:
which in turn calls fmxCallMATLAB
, a subroutine provided by MATLAB to evaluate fun_ptr
:
evalcb_f
also converts x
and f
between REAL(RP)
and REAL(DP)
in the following lines, which read x
and write f
, respectively:In this way, we do not need RP
to be the same as DP
(or RPC
in the c interface)
I suppose similar things can be done when interfacing with other languages.
Thanks.
Best regards, Zaikun
I've started to implement something like that, and I saw your comment:
! This is an internal procedure that defines CALFUN. Since F2008, we
! can pass internal procedures as actual arguments. See Note 12.18
! on page 290 of WD 1539-1 J3/10-007r1 (F2008 Working Document).
! We implement CALFUN internally so that FUN_PTR is visible to it.
! Do NOT pass FUN_PTR by a module variable, which is thread-unsafe.
! use, non_intrinsic :: cbfun_mod, only : evalcb
does this mean I have to redefine CALFUN in each solver interface module (cobyli, ...) to avoid thread safety problems ?
I've started to implement something like that, and I saw your comment:
! This is an internal procedure that defines CALFUN. Since F2008, we ! can pass internal procedures as actual arguments. See Note 12.18 ! on page 290 of WD 1539-1 J3/10-007r1 (F2008 Working Document). ! We implement CALFUN internally so that FUN_PTR is visible to it. ! Do NOT pass FUN_PTR by a module variable, which is thread-unsafe. ! use, non_intrinsic :: cbfun_mod, only : evalcb
Hi @jschueller , it is great that you are pushing things forward. Thank you very much for your efforts!
! Do NOT pass FUN_PTR by a module variable, which is thread-unsafe. does this mean I have to redefine CALFUN in each solver interface module (cobyli, ...) to avoid thread safety problems ?
Hm, I am not sure what "redefine" means ... Anyway, you need to define or implement calfun
using the inputs received from the language being interfaced (here, c).
Let us look at the following hypothetical example. It defines calfun
with the help of a module variable fun_ptr
. Why don't we pass fun_ptr
as an input to calfun
directly? Because calfun
accepts only x
and f
as inputs.
module function_pointer
! The only purpose of this module is to record the value of `fun_ptr`,
! which is a function pointer. Subroutines will be able to write / read
! the value of `fun_ptr` by using this module, so that this value can
! be passed between subroutines without being an input. Why do
! we need this? Because the implementation of `calfun` needs `fun_ptr`,
! but it only accepts `x` and `f` as inputs.
pointer :: fun_ptr
end module function_pointer
subrotine calfun(x, f)
! This subroutine implements `calfun`.
! The following line makes `fun_ptr` accessible (readable).
use, non_intrinsic :: funtion_pointer, only : fun_ptr
! The next line evaluates `f` using the function represented by `fun_ptr`
! at `x`. The implementation of `evaluate` depends on the language being
! interfaced.
f = evaluate(fun_ptr, x)
end subroutine calfun
subroutine main (inputs, outputs)
! This subroutine is the real "interface". It receives the inputs from
! the language being interfaced, extracts the information needed
! by `calfun`, particularly `fun_ptr`, and then defines the outputs that
! will be passed to the language being interfaced.
! The following line makes `fun_ptr` accessible (writable).
use, non_intrinsic :: function_pointer, only : fun_ptr
! The next line extracts the value of `fun_ptr` from the inputs. It is
! possible that `fun_ptr` is simply one of the inputs. This depends on
! the language being interfaced.
fun_ptr = get_fun_ptr(inputs)
! The next line invokes `newuoa`.
call newuoa(calfun, ...)
! Finally, define the outputs.
outputs = ...
end subroutine main
This implementation should work. However, it is not thread-safe, simply because module variables are not thread-safe. Any subroutine can change fun_ptr
by using the function_pointer
module.
What is a thread-safe implementation? Define calfun
as an internal subroutine as follows (i.e., as I did for MATLAB).
subroutine main (inputs, outputs)
! This subroutine is the real "interface". It receives the inputs from
! the language being interfaced, extracts the information needed
! by `calfun`, particularly `fun_ptr`, and then defines the outputs that
! will be passed to the language being interfaced.
! Declare the function pointer.
pointer :: fun_ptr
! The next line extracts the value of `fun_ptr` from the inputs. It is
! possible that `fun_ptr` is simply one of the inputs. This depends on
! the language being interfaced.
fun_ptr = get_fun_ptr(inputs)
! The next line invokes `newuoa`.
call newuoa(calfun, ...)
! Finally, define the outputs.
outputs = ...
contains
! The lines below defines `calfun` as an internal subroutine. In this way,
! `fun_ptr` is visible to `calfun`. We do not need to pass it as an input.
subroutine calfun(x, f)
! The next line evaluates `f` using the function represented by `fun_ptr`
! at `x`. The implementation of `evaluate` depends on the language being
! interfaced.
f = evaluate(fun_ptr, x)
end subroutine calfun
end subroutine main
In this way, no module variable is needed.
I hope this is helpful. Tell me if more information is needed. Thanks.
Best regards, Zaikun
Hi @jschueller
Looking at the following code, I think I understand better what is your approach and what may be missing. (As mentioned, the last time I coded c was fifteen years ago. Excuse me if I misunderstand things).
You are relying on the fact / assumption that the c function f
will be "understood" by Fortran (or by newuoa
) as a subroutine that conforms to the interface of calfun
defined in pintrf.f90
. This may be achievable / OK in c (thanks to iso_c_binging
?), but I do not think it is such a reliable assumption when talking about interfacing in general.
A more reliable approach is to define / implement a real calfun
using the inputs received from the language being interfaced (here, the c function f
, and probably n
as well). This implementation depends on how Fortran and c talk to each other according to iso_c_binding
.
In other words, we need to wrap f
into a real calfun
before passing it to newuoa
. I hope the pseudocode in my last comment can make this point clearer.
static void f(const prima_real *x, const prima_int n, prima_real *f)
{
const prima_real x1 = x[0];
const prima_real x2 = x[1];
*f = 5*(x1-3)*(x1-3)+7*(x2-2)*(x2-2)+0.1*(x1+x2)-10;
}
prima_int rc = prima_newuoa(&f, n, x, &nf, rhobeg, rhoend, maxfun, iprint);
Hi @jschueller
Continuing with my last comment, I do believe what is missing is the following. In short, we need one more layer between c and Fortran code. Let us call this layer newuoa_for_c
, taking newuoa
as an example.
module newuoa_for_c_mod
contains
subroutine newuoa_for_c (c_objective_function, n, other_inputs, outputs)
! This is the `newuoa` subroutine that should be called by c code. Within the
! subroutine, `calfun` is defined as an internal subroutine, and the Fortran
! version of `newuoa` is invoked.
!
! Note that c code SHOULD NOT call the Fortran version of `newuoa` directly.
! Instead, another layer is needed so that the inputs and outputs can be read
! and written properly.
call newuoa(canfun, ...)
contains
! The internal subroutine that implements `calfun` using `c_objective_function` and `n`.
subroutine calfun(x, f)
call evalcb(c_objective_function, n, x, f)
end subroutine calfun
end subroutine newuoa_for_c
end module newuoa_for_c_mod
In this way, no change is needed in the Fortran code, maybe except for the change of module names as you proposed to avoid colliding.
does this mean I have to redefine CALFUN in each solver interface module (cobyli, ...) to avoid thread safety problems ?
Hm, I am not sure what "redefine" means ... Anyway, you need to define or implement calfun using the inputs received from the language being interfaced (here, c).
With the above said, I think I understand what I did not before. I hope it is the same for you.
I will be very happy to explain more if needed. Thanks.
Best regards, Zaikun
Ok, I think I got it
See the :open_file_folder: files view, the :scroll:action log or :angel: SARIF report for details.
cintrf COBJ COBJCON cobjfun cobjfuncon constrc evalcobj evalcobjcon
See the :open_file_folder: files view, the :scroll:action log or :angel: SARIF report for details.
cstatus execstack nfev
See the :open_file_folder: files view, the :scroll:action log or :angel: SARIF report for details.
cstatus execstack nfev
it should be way better now, without any modification in the existing files except for the module clash problem could you run the CI ?
thanks, it passes now
Hello,
I tried the Python interface, it was easy to compile and use.
I fixed the issue I faced, maybe the Python interface doc should be improved.
This works:
from prima import cobyla
from scipy.optimize import rosen
from array import array
n=3
m=0
def obj(x):
f=rosen(x)
return float(f), None
results=cobyla(obj, n, m, x0=array('f',(0,0,0)), rhobeg=1.0, rhoend=1e-6, maxfun=10000, iprint=2)
print("Results=",results)
for cobyla the evaluation function is supposed to return the objective value and the constraints:
def obj(x):
f=rosen(x)
print("*****x=",x,"rosen=",f)
return f, []
then it goes much further:
Results= Result(x=[0.7862963781876416, 0.6167754104876301, 0.37867099915418506], success=True, status=0, message='Trust region radius reaches its lower bound', nfev=484)
for cobyla the evaluation function is supposed to return the objective value and the constraints:
def obj(x): f=rosen(x) print("*****x=",x,"rosen=",f) return f, []
then it goes much further:
Results= Result(x=[0.7862963781876416, 0.6167754104876301, 0.37867099915418506], success=True, status=0, message='Trust region radius reaches its lower bound', nfev=484)
Yes thanks that works now !
@zaikunzhang do you think the C interface is ok for you ? I was thinking maybe it could be merged.
The Python part probably still needs discussion though, that's why I removed it for now, but I can still propose it in a later PR if needed, waiting for a pure Python implementation (#37). 0001-Add-Python-binding.txt
@zaikunzhang do you think the C interface is ok for you ? I was thinking maybe it could be merged.
The Python part probably still needs discussion though, that's why I removed it for now, but I can still propose it in a later PR if needed, waiting for a pure Python implementation (#37). 0001-Add-Python-binding.txt
Sorry for the delayed reply.
I have been working on some revisions to lincoa and cobyla, which are motivated by your interface. The revisions would imply changes to the interfaces.
In short, lincoa and cobyla will support more types of constraints natively. See
https://github.com/libprima/prima/blob/dev_cobyla/fortran/lincoa/lincoa.f90 https://github.com/libprima/prima/blob/dev_cobyla/fortran/cobyla/cobyla.f90
for details. Sorry but this takes a lot of time to code.
ok, I guess I will need to adapt when this is merged into main, feel free to ping me when it lands then
@zaikunzhang do you think the C interface is ok for you ? I was thinking maybe it could be merged. The Python part probably still needs discussion though, that's why I removed it for now, but I can still propose it in a later PR if needed, waiting for a pure Python implementation (#37). 0001-Add-Python-binding.txt
Sorry for the delayed reply.
I have been working on some revisions to lincoa and cobyla, which are motivated by your interface.
@jschueller Let me explain what "are motivated by your interface".
Ideally, the MATLAB/Python/Julia/R interfaces should allow users to provide bound/linear/nonlinear constraints in a way that is standard & native to the corresponding languages. This is what we did in PDFO. The Python interface of PDFO supports any constraint specification that is supported by SciPy (maybe @ragonneau can explain more). In addition, the MATLAB interface of PDFO and PRIMA is the same as the fmincon
function shipped by MathWorks (indeed, even more permissive; see https://github.com/libprima/prima/blob/main/matlab/interfaces/prima.m). The interface reads (and preprocesses) the constraints, and then translates them to the form that is understandable by the Fortran backend. The idea is to make the interface user-friendly and compatible with standard libraries that are already available in the languages. Otherwise, users would need to wrap the constraints themselves, which is tedious and error-prone.
For languages like Fortran and C, however, it is much more difficult to do so. Therefore, up to the current release (v0.5.1). the Fortran version of LINCOA supports only linear inequality constraints, and COBYLA supports only nonlinear inequality constraints.
Nevertheless, after seeing the Python interface, I decided to extend the flexibility of the Fortran version in terms of constraint types, so that it will be easier to code the Python/R/Julia interface, although this takes quite an effort on the Fortran side.
After this, I hope the Python interface can become more user-friendly and consistent with standard libraries in terms of constraint specification (the interface of PDFO might be a good reference).
Thanks.
Here are some suggestions regarding bobyqi.f90
. They apply to other similar files.
c
folder, not under the fortran
folder, as it is not part of the Fortran implementation. See what is done for MATLAB for example: https://github.com/libprima/prima/tree/main/matlab/mex_gateways; check also what the Python interface of PDFO does: https://github.com/pdfo/pdfo/tree/main/python/py_gateways.bobyqa_cbinding.f90
or bobyqa_c.f90
or bobyqc.f90
. Correspondingly, rename the subroutine therein to bobyqa_cbinding
or bobyqa_c
or bobyqc
. real(C_DOUBLE), intent(inout) :: x(n)
): https://github.com/jschueller/prima/blob/3a1f13e675c4ee3ea79381a42a7179b1f7de11fd/fortran/bobyqa/bobyqi.f90#L26-L28
Instead, assumed-shape arrays should be always used. They have many advantages. For example, compilers will check the shape and type of the arrays at compilation time, which will not happen if we use explicit-shape arrays. Hence, explicit-shape arrays can easily lead to bugs. See discussions at https://fortran-lang.org/en/learn/best_practices/arrays/ . That said, the x
mentioned above should be passed as
real(C_DOUBLE), intent(inout) :: x(:)
and then, in the debugging mode, check that size(x) == n
.
A function
should never have side effects, meaning that it should not change the value of any inputs, even though this is allowed by the standard. Therefore, for me, this should be a subroutine
rather than a function
(if possible).
This comment applies to any function
.
x2
, f2
should be always avoided. In this case, they should be x_loc
and f_loc
. However, these variables are not needed at all. Calling bobyqa
in the following way would work:
call bobyqa(calfun, x, f, xl=xl, xu=xu, nf=nf, rhobeg=rhobeg, rhoend=rhoend, maxfun=maxfun, iprint=iprint, info=info)
f
should also be an output. real(RP) :: f2 = 0
It does not work as you expect, it can lead to mysterious bugs, and it should NEVER BE USED. See https://github.com/j3-fortran/fortran_proposals/issues/40 The initialization of this variable is not needed anyway in this case.
https://github.com/jschueller/prima/blob/3a1f13e675c4ee3ea79381a42a7179b1f7de11fd/fortran/bobyqa/bobyqi.f90#L16 https://github.com/jschueller/prima/blob/3a1f13e675c4ee3ea79381a42a7179b1f7de11fd/fortran/bobyqa/bobyqi.f90#L50
Assume (not sure yet) that bobyqa_iso_c
will be a function. Instead of
integer(C_INT) function bobyqa_iso_c(cobjfun, n, x, xl, xu, nf, rhobeg, rhoend, maxfun, iprint) bind(C)
...
bobyqa_iso_c = info
we should declare info
as the result, e.g.,
function bobyqa_iso_c(cobjfun, n, x, xl, xu, nf, rhobeg, rhoend, maxfun, iprint) bind(C) result(info)
integer(C_INT) :: info
This applies to any function
. In other words, functions should have a result
, instead of using the function name as the result. See examples at https://github.com/libprima/prima/blob/main/fortran/common/linalg.f90 (search for function
).
Thank you very much for your attention.
@jschueller I wrote the above comments very quickly. Apologies if I overlooked something. I am open to discussions. Tell me if there is anything not clear.
Thank you very much for your great contributions.
ok, I applied all remarks, except for the array types, this is needed for C interoperability (actually it works too with gfortran but not ifort) also we do need the _loc variables to avoid compiler error on the conversion since C types and Fortran types may differ
I've added comments for these issues
This adds a C interface to the Fortran library thanks to the iso_c_binding module:
prima.h:
Some fortran modules have to be renamed to allow building everything into one single fortran library.
This also introduces a CMake build system to build both the Fortran library and the C library in a cross-platform manner.
Comments welcome