Open HaoZeke opened 10 months ago
-c
but it really only shows up as a bug with -c
(to be fair it is still a bug without it but unlikely to happen in practice).This would for example, allow better understanding / isolation of #2547 and #13553.
I guess this comment https://github.com/numpy/numpy/issues/25199#issuecomment-1829544967 is pertinent here as well. I have been looking at that part of the code in detail in the context of #25263. I am happy to contribute to this if we have general agreement on how to proceed, but I also found the distinction between run_main()
and run_compile()
clunky, especially since the latter calls the former.
I guess this comment #25199 (comment) is pertinent here as well. I have been looking at that part of the code in detail in the context of #25263. I am happy to contribute to this if we have general agreement on how to proceed, but I also found the distinction between
run_main()
andrun_compile()
clunky, especially since the latter calls the former.
@jmrohwer, I would really appreciate help streamlining this file. The problem is that there's global state populated in f2py2e
which makes it rather hit and miss to refactor easily.
Note that (problematically) run_main
is only called by run_compile
for the meson
backend, the distutils
backend used to handle f2py
specific build files internally. For the meson
backend, a choice was made to make sure that the same code path is taken, which is still a source of bugs.
However, the goal would be to align them logically, that is:
run_compile
shouldn't do anything run_main
doesn't basicallyI think for starters that would involve reducing run_compile
to just calling build_backend
or even removing it completely, the "right way" would be to parse all the arguments using argparse
and then dispatch from there instead of the scaninputline
stuff which is a legacy holdover from pre-argparse
.
What were your thoughts about this?
@HaoZeke Thanks for getting back to me, I'm keen to be involved. I certainly don't have the bandwidth to refactor the whole CLI in terms of argparse
and as I understood it the decision in #24552 was that this be done incrementally with any new command-line arguments being moved to argparse
.
My main gripe with the current CLI is inconsistent behaviour (see also https://github.com/numpy/numpy/issues/25199#issuecomment-1829544967), specifically when using a custom edited *.pyf file to only expose some of the Fortran routines to Python. Consider mymodule.f
that defines two subroutines f1()
and f2()
, and mymodule.pyf
that only exposes the interface to f1
because we don't want to expose f2
in Python. (The original mymodule.pyf
could have been generated with f2py -h
but custom edited afterwards according to needs).
f2py -m mymodule mymodule.pyf mymodule.f
will generate wrappers and C-code for f1
AND f2
(this was counter-intuitive to me).f2py mymodule.pyf
will generate wrappers and C-code for f1
only.f2py -c mymodule.pyf mymodule.f
will create wrappers and compile the module exposing f1
only!A user may want to use f2py
to generate C-code and wrappers only, and compile using their own build system (this is how I stumbled upon these inconsistencies).
A further point is that I see in the code that in run_compile
the call to run_main
is only done for the meson backend, not for distutils. I am not sure why this is and have not looked at the distutils backend in detail. How long will the distutils backend be supported and would any changes implemented here also have to be cross-checked against the distutils backend?
Before proceeding, we would need to agree and define exactly what is the expected behaviour for
f2py -m mymodule mymodule.pyf mymodule.f
f2py -c mymodule.pyf mymodule.f
in a situation as described.
EDIT: I see the behaviour has changed again after https://github.com/numpy/numpy/pull/25267. The behaviour described above was before this commit.
f2py mymodule.pyf mymodule.f
now produces two C files: mymodulemodule.c
(wraps only f1
) and untitledmodule.c
(wraps f1
and f2
).f2py mymodule.pyf
produces the same two files, with mymodulemodule.c
being the same as above but untitledmodule.c
being smaller and containing no wrappers for either f1
or f2
f2py -m mymodule mymodule.pyf mymodule.f
produces mymodulemodule.c
that wraps both f1
and f2
f2py -c mymodule.pyf mymodule.f
compiles correctly but again produces two files during the build process (mymodulemodule.c
and untitledmodule.c
, seems the latter is unused?)This is confusing to say the least!
@HaoZeke Thanks for getting back to me, I'm keen to be involved. I certainly don't have the bandwidth to refactor the whole CLI in terms of
argparse
and as I understood it the decision in #24552 was that this be done incrementally with any new command-line arguments being moved toargparse
.
Yup, that was the guideline for new flags (e.g. meson
ones) but this has to do with different intermediates generated by distutils
as a backend (which doesn't call run_main
) and meson
as a backend (which does)..
My main gripe with the current CLI is inconsistent behaviour (see also #25199 (comment)), specifically when using a custom edited *.pyf file to only expose some of the Fortran routines to Python. Consider
mymodule.f
that defines two subroutinesf1()
andf2()
, andmymodule.pyf
that only exposes the interface tof1
because we don't want to exposef2
in Python. (The originalmymodule.pyf
could have been generated withf2py -h
but custom edited afterwards according to needs).1. `f2py -m mymodule mymodule.pyf mymodule.f` will generate wrappers and C-code for `f1` AND `f2` (this was counter-intuitive to me). 2. In contrast `f2py mymodule.pyf` will generate wrappers and C-code for `f1` only. 3. Again, in contrast (and contrary to 1. above), `f2py -c mymodule.pyf mymodule.f` will create wrappers and compile the module **exposing `f1` only**!
A user may want to use
f2py
to generate C-code and wrappers only, and compile using their own build system (this is how I stumbled upon these inconsistencies).
This is also what the meson
backend tries to do (which is why it calls run_main
).
A further point is that I see in the code that in
run_compile
the call torun_main
is only done for the meson backend, not for distutils. I am not sure why this is and have not looked at the distutils backend in detail. How long will the distutils backend be supported and would any changes implemented here also have to be cross-checked against the distutils backend?
It is difficult to say, technically until the minimum Python version supported in numpy
becomes 3.12
we'd need to support it, but it is a major hindrance. distutils
does quite a bit of special handling for -c
which even includes different wrappers being generated even back to NumPy 1.19
as seen in https://github.com/numpy/numpy/pull/25287#issuecomment-1835986872.
Before proceeding, we would need to agree and define exactly what is the expected behaviour for
* `f2py -m mymodule mymodule.pyf mymodule.f` * `f2py -c mymodule.pyf mymodule.f`
in a situation as described.
EDIT: I see the behaviour has changed again after #25267. The behaviour described above was before this commit.
* `f2py mymodule.pyf mymodule.f` now produces two C files: `mymodulemodule.c` (wraps only `f1`) and `untitledmodule.c` (wraps `f1` and `f2`). * `f2py mymodule.pyf` produces the same two files, with `mymodulemodule.c` being the same as above but `untitledmodule.c` being smaller and containing no wrappers for either `f1` or `f2` * `f2py -m mymodule mymodule.pyf mymodule.f` produces `mymodulemodule.c` that wraps both `f1` and `f2` * `f2py -c mymodule.pyf mymodule.f` compiles correctly but again produces two files during the build process (`mymodulemodule.c` and `untitledmodule.c`, seems the latter is unused?)
This is confusing to say the least!
The intent in #25267 was to ensure untitledmodule.c
is never made (I can't reproduce that locally), but something was missing there anyway, working on it over in #25287.
I think for:
f2py -m mymodule mymodule.pyf mymodule.f
This should ignore mymodule
unless it is the module name defined in the .pyf
file
f2py -c mymodule.pyf mymodule.f
This should behave the same as the previous, i.e. take the .pyf
file as the reference.
However, unfortunately something in #25267 broke scipy
(quite badly) and this shouldn't be happening. Once that's done I assume much of these problems will fine.
At this point I feel like there's little else to do but bite the bullet and ensure run_main
and run_compile
generate the same intermediate files, even if it takes a lot of effort. As soon as I get a pragmatic fix to scipy
into main
I'll start with it.
Another data point:
The root cause of this is that -m
and -c
produce different module wrapper files #25179, and the meson
backend relies on them being the same. When distutils
was the backend, the difference between the working -c
and the non-working -m
is, for v1.19.5
:
5c5
< * Generation date: Fri Dec 1 11:42:08 2023
---
> * Generation date: Fri Dec 1 11:38:27 2023
23c23
< typedef signed char signed_char;
---
> /*need_typedefs*/
26c26
< typedef double(*cb_fun_in___user__routines_typedef)(int *);
---
> typedef double(*cb_fun_in_foo__user__routines_typedef)(int *);
47,53d46
< #define GETSCALARFROMPYTUPLE(tuple,index,var,ctype,mess) {\
< if ((capi_tmp = PyTuple_GetItem((tuple),(index)))==NULL) goto capi_fail;\
< if (!(ctype ## _from_pyobj((var),capi_tmp,mess)))\
< goto capi_fail;\
< }
<
< #define pyobj_from_int1(v) (PyInt_FromLong(v))
107,145d99
< static int double_from_pyobj(double* v,PyObject *obj,const char *errmess) {
< PyObject* tmp = NULL;
< if (PyFloat_Check(obj)) {
< #ifdef __sgi
< *v = PyFloat_AsDouble(obj);
< #else
< *v = PyFloat_AS_DOUBLE(obj);
< #endif
< return 1;
< }
< tmp = PyNumber_Float(obj);
< if (tmp) {
< #ifdef __sgi
< *v = PyFloat_AsDouble(tmp);
< #else
< *v = PyFloat_AS_DOUBLE(tmp);
< #endif
< Py_DECREF(tmp);
< return 1;
< }
< if (PyComplex_Check(obj))
< tmp = PyObject_GetAttrString(obj,"real");
< else if (PyString_Check(obj) || PyUnicode_Check(obj))
< /*pass*/;
< else if (PySequence_Check(obj))
< tmp = PySequence_GetItem(obj,0);
< if (tmp) {
< PyErr_Clear();
< if (double_from_pyobj(v,tmp,errmess)) {Py_DECREF(tmp); return 1;}
< Py_DECREF(tmp);
< }
< {
< PyObject* err = PyErr_Occurred();
< if (err==NULL) err = callback2_error;
< PyErr_SetString(err,errmess);
< }
< return 0;
< }
<
262c216
< extern void F_FUNC(foo,FOO)(cb_fun_in___user__routines_typedef,double*);
---
> extern void F_FUNC(foo,FOO)(cb_fun_in_foo__user__routines_typedef,double*);
270,277c224,231
< /************************* cb_fun_in___user__routines *************************/
< PyObject *cb_fun_in___user__routines_capi = NULL;/*was Py_None*/
< PyTupleObject *cb_fun_in___user__routines_args_capi = NULL;
< int cb_fun_in___user__routines_nofargs = 0;
< jmp_buf cb_fun_in___user__routines_jmpbuf;
< /*typedef double(*cb_fun_in___user__routines_typedef)(int *);*/
< static double cb_fun_in___user__routines (int *i_cb_capi) {
< PyTupleObject *capi_arglist = cb_fun_in___user__routines_args_capi;
---
> /*********************** cb_fun_in_foo__user__routines ***********************/
> PyObject *cb_fun_in_foo__user__routines_capi = NULL;/*was Py_None*/
> PyTupleObject *cb_fun_in_foo__user__routines_args_capi = NULL;
> int cb_fun_in_foo__user__routines_nofargs = 0;
> jmp_buf cb_fun_in_foo__user__routines_jmpbuf;
> /*typedef double(*cb_fun_in_foo__user__routines_typedef)(int *);*/
> static double cb_fun_in_foo__user__routines (int *i_cb_capi) {
> PyTupleObject *capi_arglist = cb_fun_in_foo__user__routines_args_capi;
289,291c243,245
< CFUNCSMESS("cb:Call-back function cb_fun_in___user__routines (maxnofargs=1(-0))\n");
< CFUNCSMESSPY("cb:cb_fun_in___user__routines_capi=",cb_fun_in___user__routines_capi);
< if (cb_fun_in___user__routines_capi==NULL) {
---
> CFUNCSMESS("cb:Call-back function cb_fun_in_foo__user__routines (maxnofargs=1(-0))\n");
> CFUNCSMESSPY("cb:cb_fun_in_foo__user__routines_capi=",cb_fun_in_foo__user__routines_capi);
> if (cb_fun_in_foo__user__routines_capi==NULL) {
293c247
< cb_fun_in___user__routines_capi = PyObject_GetAttrString(callback2_module,"fun");
---
> cb_fun_in_foo__user__routines_capi = PyObject_GetAttrString(callback2_module,"fun");
295c249
< if (cb_fun_in___user__routines_capi==NULL) {
---
> if (cb_fun_in_foo__user__routines_capi==NULL) {
299,302c253,256
< if (F2PyCapsule_Check(cb_fun_in___user__routines_capi)) {
< cb_fun_in___user__routines_typedef cb_fun_in___user__routines_cptr;
< cb_fun_in___user__routines_cptr = F2PyCapsule_AsVoidPtr(cb_fun_in___user__routines_capi);
< return_value=(*cb_fun_in___user__routines_cptr)(i_cb_capi);
---
> if (F2PyCapsule_Check(cb_fun_in_foo__user__routines_capi)) {
> cb_fun_in_foo__user__routines_typedef cb_fun_in_foo__user__routines_cptr;
> cb_fun_in_foo__user__routines_cptr = F2PyCapsule_AsVoidPtr(cb_fun_in_foo__user__routines_capi);
> return_value=(*cb_fun_in_foo__user__routines_cptr)(i_cb_capi);
332c286
< if (cb_fun_in___user__routines_nofargs>capi_i)
---
> if (cb_fun_in_foo__user__routines_nofargs>capi_i)
346c300
< capi_return = PyObject_CallObject(cb_fun_in___user__routines_capi,(PyObject *)capi_arglist_list);
---
> capi_return = PyObject_CallObject(cb_fun_in_foo__user__routines_capi,(PyObject *)capi_arglist_list);
350c304
< capi_return = PyObject_CallObject(cb_fun_in___user__routines_capi,(PyObject *)capi_arglist);
---
> capi_return = PyObject_CallObject(cb_fun_in_foo__user__routines_capi,(PyObject *)capi_arglist);
371,372c325,326
< GETSCALARFROMPYTUPLE(capi_return,capi_i++,&return_value,double,"double_from_pyobj failed in converting return_value of call-back function cb_fun_in___user__routines to C double\n");
< CFUNCSMESS("cb:cb_fun_in___user__routines:successful\n");
---
> GETSCALARFROMPYTUPLE(capi_return,capi_i++,&return_value,double,"double_from_pyobj failed in converting return_value of call-back function cb_fun_in_foo__user__routines to C double\n");
> CFUNCSMESS("cb:cb_fun_in_foo__user__routines:successful\n");
379c333
< fprintf(stderr,"Call-back cb_fun_in___user__routines failed.\n");
---
> fprintf(stderr,"Call-back cb_fun_in_foo__user__routines failed.\n");
383c337
< longjmp(cb_fun_in___user__routines_jmpbuf,-1);
---
> longjmp(cb_fun_in_foo__user__routines_jmpbuf,-1);
388c342
< /********************* end of cb_fun_in___user__routines *********************/
---
> /******************** end of cb_fun_in_foo__user__routines ********************/
395c349
< r = foo(f,[f_extra_args])\n\nWrapper for ``foo``.\
---
> r = foo(fun,[fun_extra_args])\n\nWrapper for ``foo``.\
397c351
< "f : call-back function => fun\n"
---
> "fun : call-back function\n"
399c353
< "f_extra_args : input tuple, optional\n Default: ()\n"
---
> "fun_extra_args : input tuple, optional\n Default: ()\n"
408c362
< /* extern void F_FUNC(foo,FOO)(cb_fun_in___user__routines_typedef,double*); */
---
> /* extern void F_FUNC(foo,FOO)(cb_fun_in_foo__user__routines_typedef,double*); */
412c366
< void (*f2py_func)(cb_fun_in___user__routines_typedef,double*)) {
---
> void (*f2py_func)(cb_fun_in_foo__user__routines_typedef,double*)) {
417,421c371,375
< PyObject *f_capi = Py_None;
< PyTupleObject *f_xa_capi = NULL;
< PyTupleObject *f_args_capi = NULL;
< int f_nofargs_capi = 0;
< cb_fun_in___user__routines_typedef f_cptr;
---
> PyObject *fun_capi = Py_None;
> PyTupleObject *fun_xa_capi = NULL;
> PyTupleObject *fun_args_capi = NULL;
> int fun_nofargs_capi = 0;
> cb_fun_in_foo__user__routines_typedef fun_cptr;
423c377
< static char *capi_kwlist[] = {"f","f_extra_args",NULL};
---
> static char *capi_kwlist[] = {"fun","fun_extra_args",NULL};
431c385
< capi_kwlist,&f_capi,&PyTuple_Type,&f_xa_capi))
---
> capi_kwlist,&fun_capi,&PyTuple_Type,&fun_xa_capi))
434,436c388,390
< /* Processing variable f */
< if(F2PyCapsule_Check(f_capi)) {
< f_cptr = F2PyCapsule_AsVoidPtr(f_capi);
---
> /* Processing variable fun */
> if(F2PyCapsule_Check(fun_capi)) {
> fun_cptr = F2PyCapsule_AsVoidPtr(fun_capi);
438c392
< f_cptr = cb_fun_in___user__routines;
---
> fun_cptr = cb_fun_in_foo__user__routines;
441,447c395,401
< f_nofargs_capi = cb_fun_in___user__routines_nofargs;
< if (create_cb_arglist(f_capi,f_xa_capi,1,0,&cb_fun_in___user__routines_nofargs,&f_args_capi,"failed in processing argument list for call-back f.")) {
< jmp_buf f_jmpbuf;
< CFUNCSMESS("Saving jmpbuf for `f`.\n");
< SWAP(f_capi,cb_fun_in___user__routines_capi,PyObject);
< SWAP(f_args_capi,cb_fun_in___user__routines_args_capi,PyTupleObject);
< memcpy(&f_jmpbuf,&cb_fun_in___user__routines_jmpbuf,sizeof(jmp_buf));
---
> fun_nofargs_capi = cb_fun_in_foo__user__routines_nofargs;
> if (create_cb_arglist(fun_capi,fun_xa_capi,1,0,&cb_fun_in_foo__user__routines_nofargs,&fun_args_capi,"failed in processing argument list for call-back fun.")) {
> jmp_buf fun_jmpbuf;
> CFUNCSMESS("Saving jmpbuf for `fun`.\n");
> SWAP(fun_capi,cb_fun_in_foo__user__routines_capi,PyObject);
> SWAP(fun_args_capi,cb_fun_in_foo__user__routines_args_capi,PyTupleObject);
> memcpy(&fun_jmpbuf,&cb_fun_in_foo__user__routines_jmpbuf,sizeof(jmp_buf));
454c408
< if ((setjmp(cb_fun_in___user__routines_jmpbuf))) {
---
> if ((setjmp(cb_fun_in_foo__user__routines_jmpbuf))) {
457c411
< (*f2py_func)(f_cptr,&r);
---
> (*f2py_func)(fun_cptr,&r);
475,480c429,434
< CFUNCSMESS("Restoring jmpbuf for `f`.\n");
< cb_fun_in___user__routines_capi = f_capi;
< Py_DECREF(cb_fun_in___user__routines_args_capi);
< cb_fun_in___user__routines_args_capi = f_args_capi;
< cb_fun_in___user__routines_nofargs = f_nofargs_capi;
< memcpy(&cb_fun_in___user__routines_jmpbuf,&f_jmpbuf,sizeof(jmp_buf));
---
> CFUNCSMESS("Restoring jmpbuf for `fun`.\n");
> cb_fun_in_foo__user__routines_capi = fun_capi;
> Py_DECREF(cb_fun_in_foo__user__routines_args_capi);
> cb_fun_in_foo__user__routines_args_capi = fun_args_capi;
> cb_fun_in_foo__user__routines_nofargs = fun_nofargs_capi;
> memcpy(&cb_fun_in_foo__user__routines_jmpbuf,&fun_jmpbuf,sizeof(jmp_buf));
482c436
< /* End of cleaning variable f */
---
> /* End of cleaning variable fun */
548c502
< " r = foo(f,f_extra_args=())\n"
---
> " r = foo(fun,fun_extra_args=())\n"
So it will likely not work with meson
for a bit. There should be a pragmatic enough fix to get scipy
working again though.
Currently
f2py
has two separate paths withinf2py2e
which do "almost the same thing".run_main
is supposed to do everything but-c
andrun_compile()
is supposed to compile to a module. There is a lot of technical debt being carried here,run_compile()
shouldn't be doing anything different fromrun_main
in the first place. This also leads to subtle bugs since the results of-c
and regularf2py
runs are often different. It also makes it hard to test the CLI well (at the very least, doubles the number of possible tests).