Currently f2py has two separate paths within f2py2e which do "almost the same thing". run_main is supposed to do everything but -c and run_compile() is supposed to compile to a module. There is a lot of technical debt being carried here, run_compile() shouldn't be doing anything different from run_main in the first place. This also leads to subtle bugs since the results of -c and regular f2py runs are often different. It also makes it hard to test the CLI well (at the very least, doubles the number of possible tests).

24552 is related, but only tangentially. Bug for bug compatibility is great, and new features are great too but this is a matter of testing. For example #25114 is difficult to test because technically the behavior should be the same with or without `-c` but it really only shows up as a bug with `-c` (to be fair it is still a bug without it but unlikely to happen in practice).

This would for example, allow better understanding / isolation of #2547 and #13553.

I guess this comment https://github.com/numpy/numpy/issues/25199#issuecomment-1829544967 is pertinent here as well. I have been looking at that part of the code in detail in the context of #25263. I am happy to contribute to this if we have general agreement on how to proceed, but I also found the distinction between run_main() and run_compile() clunky, especially since the latter calls the former.

I guess this comment #25199 (comment) is pertinent here as well. I have been looking at that part of the code in detail in the context of #25263. I am happy to contribute to this if we have general agreement on how to proceed, but I also found the distinction between run_main() and run_compile() clunky, especially since the latter calls the former.

@jmrohwer, I would really appreciate help streamlining this file. The problem is that there's global state populated in f2py2e which makes it rather hit and miss to refactor easily.

Note that (problematically) run_main is only called by run_compile for the meson backend, the distutils backend used to handle f2py specific build files internally. For the meson backend, a choice was made to make sure that the same code path is taken, which is still a source of bugs.

However, the goal would be to align them logically, that is:

run_compile shouldn't do anything run_main doesn't basically

I think for starters that would involve reducing run_compile to just calling build_backend or even removing it completely, the "right way" would be to parse all the arguments using argparse and then dispatch from there instead of the scaninputline stuff which is a legacy holdover from pre-argparse.

What were your thoughts about this?

@HaoZeke Thanks for getting back to me, I'm keen to be involved. I certainly don't have the bandwidth to refactor the whole CLI in terms of argparse and as I understood it the decision in #24552 was that this be done incrementally with any new command-line arguments being moved to argparse.

My main gripe with the current CLI is inconsistent behaviour (see also https://github.com/numpy/numpy/issues/25199#issuecomment-1829544967), specifically when using a custom edited *.pyf file to only expose some of the Fortran routines to Python. Consider mymodule.f that defines two subroutines f1() and f2(), and mymodule.pyf that only exposes the interface to f1 because we don't want to expose f2 in Python. (The original mymodule.pyf could have been generated with f2py -h but custom edited afterwards according to needs).

f2py -m mymodule mymodule.pyf mymodule.f will generate wrappers and C-code for f1 AND f2 (this was counter-intuitive to me).
In contrast f2py mymodule.pyf will generate wrappers and C-code for f1 only.
Again, in contrast (and contrary to 1. above), f2py -c mymodule.pyf mymodule.f will create wrappers and compile the module exposing f1 only!

A user may want to use f2py to generate C-code and wrappers only, and compile using their own build system (this is how I stumbled upon these inconsistencies).

A further point is that I see in the code that in run_compile the call to run_main is only done for the meson backend, not for distutils. I am not sure why this is and have not looked at the distutils backend in detail. How long will the distutils backend be supported and would any changes implemented here also have to be cross-checked against the distutils backend?

Before proceeding, we would need to agree and define exactly what is the expected behaviour for

f2py -m mymodule mymodule.pyf mymodule.f
f2py -c mymodule.pyf mymodule.f

in a situation as described.

EDIT: I see the behaviour has changed again after https://github.com/numpy/numpy/pull/25267. The behaviour described above was before this commit.

f2py mymodule.pyf mymodule.f now produces two C files: mymodulemodule.c (wraps only f1) and untitledmodule.c (wraps f1 and f2).
f2py mymodule.pyf produces the same two files, with mymodulemodule.c being the same as above but untitledmodule.c being smaller and containing no wrappers for either f1 or f2
f2py -m mymodule mymodule.pyf mymodule.f produces mymodulemodule.c that wraps both f1 and f2
f2py -c mymodule.pyf mymodule.f compiles correctly but again produces two files during the build process (mymodulemodule.c and untitledmodule.c, seems the latter is unused?)

This is confusing to say the least!

@HaoZeke Thanks for getting back to me, I'm keen to be involved. I certainly don't have the bandwidth to refactor the whole CLI in terms of argparse and as I understood it the decision in #24552 was that this be done incrementally with any new command-line arguments being moved to argparse.

Yup, that was the guideline for new flags (e.g. meson ones) but this has to do with different intermediates generated by distutils as a backend (which doesn't call run_main) and meson as a backend (which does)..

My main gripe with the current CLI is inconsistent behaviour (see also #25199 (comment)), specifically when using a custom edited *.pyf file to only expose some of the Fortran routines to Python. Consider mymodule.f that defines two subroutines f1() and f2(), and mymodule.pyf that only exposes the interface to f1 because we don't want to expose f2 in Python. (The original mymodule.pyf could have been generated with f2py -h but custom edited afterwards according to needs).
1. `f2py -m mymodule mymodule.pyf mymodule.f` will generate wrappers and C-code for `f1` AND `f2` (this was counter-intuitive to me).

2. In contrast `f2py mymodule.pyf` will generate wrappers and C-code for `f1` only.

3. Again, in contrast (and contrary to 1. above), `f2py -c mymodule.pyf mymodule.f` will create wrappers and compile the module **exposing `f1` only**!
A user may want to use f2py to generate C-code and wrappers only, and compile using their own build system (this is how I stumbled upon these inconsistencies).

This is also what the meson backend tries to do (which is why it calls run_main).

A further point is that I see in the code that in run_compile the call to run_main is only done for the meson backend, not for distutils. I am not sure why this is and have not looked at the distutils backend in detail. How long will the distutils backend be supported and would any changes implemented here also have to be cross-checked against the distutils backend?

It is difficult to say, technically until the minimum Python version supported in numpy becomes 3.12 we'd need to support it, but it is a major hindrance. distutils does quite a bit of special handling for -c which even includes different wrappers being generated even back to NumPy 1.19 as seen in https://github.com/numpy/numpy/pull/25287#issuecomment-1835986872.

Before proceeding, we would need to agree and define exactly what is the expected behaviour for

* `f2py -m mymodule mymodule.pyf mymodule.f`

* `f2py -c mymodule.pyf mymodule.f`

in a situation as described.

EDIT: I see the behaviour has changed again after #25267. The behaviour described above was before this commit.

* `f2py mymodule.pyf mymodule.f` now produces two C files: `mymodulemodule.c` (wraps only `f1`) and `untitledmodule.c` (wraps `f1` and `f2`).

* `f2py mymodule.pyf` produces the same two files, with `mymodulemodule.c` being the same as above but `untitledmodule.c` being smaller and containing no wrappers for either `f1` or `f2`

* `f2py -m mymodule mymodule.pyf mymodule.f` produces `mymodulemodule.c` that wraps both `f1` and `f2`

* `f2py -c mymodule.pyf mymodule.f` compiles correctly but again produces two files during the build process (`mymodulemodule.c` and `untitledmodule.c`, seems the latter is unused?)

This is confusing to say the least!

The intent in #25267 was to ensure untitledmodule.c is never made (I can't reproduce that locally), but something was missing there anyway, working on it over in #25287.

I think for:

f2py -m mymodule mymodule.pyf mymodule.f

This should ignore mymodule unless it is the module name defined in the .pyf file

f2py -c mymodule.pyf mymodule.f

This should behave the same as the previous, i.e. take the .pyf file as the reference.

However, unfortunately something in #25267 broke scipy (quite badly) and this shouldn't be happening. Once that's done I assume much of these problems will fine.

At this point I feel like there's little else to do but bite the bullet and ensure run_main and run_compile generate the same intermediate files, even if it takes a lot of effort. As soon as I get a pragmatic fix to scipy into main I'll start with it.

Another data point: The root cause of this is that -m and -c produce different module wrapper files #25179, and the meson backend relies on them being the same. When distutils was the backend, the difference between the working -c and the non-working -m is, for v1.19.5:

5c5
<  * Generation date: Fri Dec  1 11:42:08 2023
---
>  * Generation date: Fri Dec  1 11:38:27 2023
23c23
< typedef signed char signed_char;
---
> /*need_typedefs*/
26c26
< typedef double(*cb_fun_in___user__routines_typedef)(int *);
---
> typedef double(*cb_fun_in_foo__user__routines_typedef)(int *);
47,53d46
< #define GETSCALARFROMPYTUPLE(tuple,index,var,ctype,mess) {\
<         if ((capi_tmp = PyTuple_GetItem((tuple),(index)))==NULL) goto capi_fail;\
<         if (!(ctype ## _from_pyobj((var),capi_tmp,mess)))\
<             goto capi_fail;\
<     }
< 
< #define pyobj_from_int1(v) (PyInt_FromLong(v))
107,145d99
< static int double_from_pyobj(double* v,PyObject *obj,const char *errmess) {
<     PyObject* tmp = NULL;
<     if (PyFloat_Check(obj)) {
< #ifdef __sgi
<         *v = PyFloat_AsDouble(obj);
< #else
<         *v = PyFloat_AS_DOUBLE(obj);
< #endif
<         return 1;
<     }
<     tmp = PyNumber_Float(obj);
<     if (tmp) {
< #ifdef __sgi
<         *v = PyFloat_AsDouble(tmp);
< #else
<         *v = PyFloat_AS_DOUBLE(tmp);
< #endif
<         Py_DECREF(tmp);
<         return 1;
<     }
<     if (PyComplex_Check(obj))
<         tmp = PyObject_GetAttrString(obj,"real");
<     else if (PyString_Check(obj) || PyUnicode_Check(obj))
<         /*pass*/;
<     else if (PySequence_Check(obj))
<         tmp = PySequence_GetItem(obj,0);
<     if (tmp) {
<         PyErr_Clear();
<         if (double_from_pyobj(v,tmp,errmess)) {Py_DECREF(tmp); return 1;}
<         Py_DECREF(tmp);
<     }
<     {
<         PyObject* err = PyErr_Occurred();
<         if (err==NULL) err = callback2_error;
<         PyErr_SetString(err,errmess);
<     }
<     return 0;
< }
< 
262c216
< extern void F_FUNC(foo,FOO)(cb_fun_in___user__routines_typedef,double*);
---
> extern void F_FUNC(foo,FOO)(cb_fun_in_foo__user__routines_typedef,double*);
270,277c224,231
< /************************* cb_fun_in___user__routines *************************/
< PyObject *cb_fun_in___user__routines_capi = NULL;/*was Py_None*/
< PyTupleObject *cb_fun_in___user__routines_args_capi = NULL;
< int cb_fun_in___user__routines_nofargs = 0;
< jmp_buf cb_fun_in___user__routines_jmpbuf;
< /*typedef double(*cb_fun_in___user__routines_typedef)(int *);*/
< static double cb_fun_in___user__routines (int *i_cb_capi) {
<   PyTupleObject *capi_arglist = cb_fun_in___user__routines_args_capi;
---
> /*********************** cb_fun_in_foo__user__routines ***********************/
> PyObject *cb_fun_in_foo__user__routines_capi = NULL;/*was Py_None*/
> PyTupleObject *cb_fun_in_foo__user__routines_args_capi = NULL;
> int cb_fun_in_foo__user__routines_nofargs = 0;
> jmp_buf cb_fun_in_foo__user__routines_jmpbuf;
> /*typedef double(*cb_fun_in_foo__user__routines_typedef)(int *);*/
> static double cb_fun_in_foo__user__routines (int *i_cb_capi) {
>   PyTupleObject *capi_arglist = cb_fun_in_foo__user__routines_args_capi;
289,291c243,245
<   CFUNCSMESS("cb:Call-back function cb_fun_in___user__routines (maxnofargs=1(-0))\n");
<   CFUNCSMESSPY("cb:cb_fun_in___user__routines_capi=",cb_fun_in___user__routines_capi);
<   if (cb_fun_in___user__routines_capi==NULL) {
---
>   CFUNCSMESS("cb:Call-back function cb_fun_in_foo__user__routines (maxnofargs=1(-0))\n");
>   CFUNCSMESSPY("cb:cb_fun_in_foo__user__routines_capi=",cb_fun_in_foo__user__routines_capi);
>   if (cb_fun_in_foo__user__routines_capi==NULL) {
293c247
<     cb_fun_in___user__routines_capi = PyObject_GetAttrString(callback2_module,"fun");
---
>     cb_fun_in_foo__user__routines_capi = PyObject_GetAttrString(callback2_module,"fun");
295c249
<   if (cb_fun_in___user__routines_capi==NULL) {
---
>   if (cb_fun_in_foo__user__routines_capi==NULL) {
299,302c253,256
<   if (F2PyCapsule_Check(cb_fun_in___user__routines_capi)) {
<   cb_fun_in___user__routines_typedef cb_fun_in___user__routines_cptr;
<   cb_fun_in___user__routines_cptr = F2PyCapsule_AsVoidPtr(cb_fun_in___user__routines_capi);
<   return_value=(*cb_fun_in___user__routines_cptr)(i_cb_capi);
---
>   if (F2PyCapsule_Check(cb_fun_in_foo__user__routines_capi)) {
>   cb_fun_in_foo__user__routines_typedef cb_fun_in_foo__user__routines_cptr;
>   cb_fun_in_foo__user__routines_cptr = F2PyCapsule_AsVoidPtr(cb_fun_in_foo__user__routines_capi);
>   return_value=(*cb_fun_in_foo__user__routines_cptr)(i_cb_capi);
332c286
<   if (cb_fun_in___user__routines_nofargs>capi_i)
---
>   if (cb_fun_in_foo__user__routines_nofargs>capi_i)
346c300
<   capi_return = PyObject_CallObject(cb_fun_in___user__routines_capi,(PyObject *)capi_arglist_list);
---
>   capi_return = PyObject_CallObject(cb_fun_in_foo__user__routines_capi,(PyObject *)capi_arglist_list);
350c304
<   capi_return = PyObject_CallObject(cb_fun_in___user__routines_capi,(PyObject *)capi_arglist);
---
>   capi_return = PyObject_CallObject(cb_fun_in_foo__user__routines_capi,(PyObject *)capi_arglist);
371,372c325,326
<     GETSCALARFROMPYTUPLE(capi_return,capi_i++,&return_value,double,"double_from_pyobj failed in converting return_value of call-back function cb_fun_in___user__routines to C double\n");
<   CFUNCSMESS("cb:cb_fun_in___user__routines:successful\n");
---
>     GETSCALARFROMPYTUPLE(capi_return,capi_i++,&return_value,double,"double_from_pyobj failed in converting return_value of call-back function cb_fun_in_foo__user__routines to C double\n");
>   CFUNCSMESS("cb:cb_fun_in_foo__user__routines:successful\n");
379c333
<   fprintf(stderr,"Call-back cb_fun_in___user__routines failed.\n");
---
>   fprintf(stderr,"Call-back cb_fun_in_foo__user__routines failed.\n");
383c337
<     longjmp(cb_fun_in___user__routines_jmpbuf,-1);
---
>     longjmp(cb_fun_in_foo__user__routines_jmpbuf,-1);
388c342
< /********************* end of cb_fun_in___user__routines *********************/
---
> /******************** end of cb_fun_in_foo__user__routines ********************/
395c349
< r = foo(f,[f_extra_args])\n\nWrapper for ``foo``.\
---
> r = foo(fun,[fun_extra_args])\n\nWrapper for ``foo``.\
397c351
< "f : call-back function => fun\n"
---
> "fun : call-back function\n"
399c353
< "f_extra_args : input tuple, optional\n    Default: ()\n"
---
> "fun_extra_args : input tuple, optional\n    Default: ()\n"
408c362
< /* extern void F_FUNC(foo,FOO)(cb_fun_in___user__routines_typedef,double*); */
---
> /* extern void F_FUNC(foo,FOO)(cb_fun_in_foo__user__routines_typedef,double*); */
412c366
<                            void (*f2py_func)(cb_fun_in___user__routines_typedef,double*)) {
---
>                            void (*f2py_func)(cb_fun_in_foo__user__routines_typedef,double*)) {
417,421c371,375
<   PyObject *f_capi = Py_None;
<   PyTupleObject *f_xa_capi = NULL;
<   PyTupleObject *f_args_capi = NULL;
<   int f_nofargs_capi = 0;
<   cb_fun_in___user__routines_typedef f_cptr;
---
>   PyObject *fun_capi = Py_None;
>   PyTupleObject *fun_xa_capi = NULL;
>   PyTupleObject *fun_args_capi = NULL;
>   int fun_nofargs_capi = 0;
>   cb_fun_in_foo__user__routines_typedef fun_cptr;
423c377
<   static char *capi_kwlist[] = {"f","f_extra_args",NULL};
---
>   static char *capi_kwlist[] = {"fun","fun_extra_args",NULL};
431c385
<     capi_kwlist,&f_capi,&PyTuple_Type,&f_xa_capi))
---
>     capi_kwlist,&fun_capi,&PyTuple_Type,&fun_xa_capi))
434,436c388,390
<   /* Processing variable f */
< if(F2PyCapsule_Check(f_capi)) {
<   f_cptr = F2PyCapsule_AsVoidPtr(f_capi);
---
>   /* Processing variable fun */
> if(F2PyCapsule_Check(fun_capi)) {
>   fun_cptr = F2PyCapsule_AsVoidPtr(fun_capi);
438c392
<   f_cptr = cb_fun_in___user__routines;
---
>   fun_cptr = cb_fun_in_foo__user__routines;
441,447c395,401
<   f_nofargs_capi = cb_fun_in___user__routines_nofargs;
<   if (create_cb_arglist(f_capi,f_xa_capi,1,0,&cb_fun_in___user__routines_nofargs,&f_args_capi,"failed in processing argument list for call-back f.")) {
<     jmp_buf f_jmpbuf;
<     CFUNCSMESS("Saving jmpbuf for `f`.\n");
<     SWAP(f_capi,cb_fun_in___user__routines_capi,PyObject);
<     SWAP(f_args_capi,cb_fun_in___user__routines_args_capi,PyTupleObject);
<     memcpy(&f_jmpbuf,&cb_fun_in___user__routines_jmpbuf,sizeof(jmp_buf));
---
>   fun_nofargs_capi = cb_fun_in_foo__user__routines_nofargs;
>   if (create_cb_arglist(fun_capi,fun_xa_capi,1,0,&cb_fun_in_foo__user__routines_nofargs,&fun_args_capi,"failed in processing argument list for call-back fun.")) {
>     jmp_buf fun_jmpbuf;
>     CFUNCSMESS("Saving jmpbuf for `fun`.\n");
>     SWAP(fun_capi,cb_fun_in_foo__user__routines_capi,PyObject);
>     SWAP(fun_args_capi,cb_fun_in_foo__user__routines_args_capi,PyTupleObject);
>     memcpy(&fun_jmpbuf,&cb_fun_in_foo__user__routines_jmpbuf,sizeof(jmp_buf));
454c408
<     if ((setjmp(cb_fun_in___user__routines_jmpbuf))) {
---
>     if ((setjmp(cb_fun_in_foo__user__routines_jmpbuf))) {
457c411
<         (*f2py_func)(f_cptr,&r);
---
>         (*f2py_func)(fun_cptr,&r);
475,480c429,434
<     CFUNCSMESS("Restoring jmpbuf for `f`.\n");
<     cb_fun_in___user__routines_capi = f_capi;
<     Py_DECREF(cb_fun_in___user__routines_args_capi);
<     cb_fun_in___user__routines_args_capi = f_args_capi;
<     cb_fun_in___user__routines_nofargs = f_nofargs_capi;
<     memcpy(&cb_fun_in___user__routines_jmpbuf,&f_jmpbuf,sizeof(jmp_buf));
---
>     CFUNCSMESS("Restoring jmpbuf for `fun`.\n");
>     cb_fun_in_foo__user__routines_capi = fun_capi;
>     Py_DECREF(cb_fun_in_foo__user__routines_args_capi);
>     cb_fun_in_foo__user__routines_args_capi = fun_args_capi;
>     cb_fun_in_foo__user__routines_nofargs = fun_nofargs_capi;
>     memcpy(&cb_fun_in_foo__user__routines_jmpbuf,&fun_jmpbuf,sizeof(jmp_buf));
482c436
<   /* End of cleaning variable f */
---
>   /* End of cleaning variable fun */
548c502
< "  r = foo(f,f_extra_args=())\n"
---
> "  r = foo(fun,fun_extra_args=())\n"

So it will likely not work with meson for a bit. There should be a pragmatic enough fix to get scipy working again though.

numpy / numpy

MAINT: Unify `run_main()` and `run_compile()` #25179