j3-fortran / fortran_proposals

Proposals for the Fortran Standard Committee

BIND (python) #261

Open thomas-robinson opened 2 years ago

thomas-robinson commented 2 years ago

Binding routines in C is great. I would like to see more languages that could use the BIND specifier.

function pyfunction (args) result (res) BIND(python, name="pyfunction")

With an increasing number of machine learning algorithms being written in python, utilizing python is going to be important to HPC applications that are written in Fortran in the future.

wyphan commented 2 years ago

This is a brilliant idea, since it removes an intermediary C layer (Cython or Numba) and lets Python interface directly to Fortran.

IMHO Fortran has to be the secret sauce to NumPy being fast even though Python is an interpreted language, because AFAIK NumPy leverages either MKL or OpenBLAS to do the heavy lifting.

Summoning @pearu (Quansight) here.


certik commented 2 years ago

Great idea. Note that there are two directions: calling Python from Fortran, and calling Fortran from Python.

See the above two issues for examples of syntax. For calling Python from Fortran (the first issue), I had in mind some syntax of the kind:

use, external(python) :: numpy, only: sin
print *, sin(5)

But using your proposal, it could also be just:

interface
    function sin(x) result(r) bind(python, name="numpy.sin")
    real, intent(in) :: x
    real :: r
    end function
end interface
print *, sin(5.0)

And one would declare it by hand to work for arrays (of all dimensions) as well. One can then create a numpy.f90 Fortran module containing all these declarations (possibly making the function sin generic over all scalars and arrays), so that it is easy to use.

h-vetinari commented 2 years ago

This is a brilliant idea, since it removes an intermediary C layer (Cython or Numba) and lets Python interface directly to Fortran. IMHO Fortran has to be the secret sauce to NumPy being fast even though Python is an interpreted language, because AFAIK NumPy leverages either MKL or OpenBLAS to do the heavy lifting.

Indeed both numpy and scipy rely on BLAS/LAPACK; both tend to use openblas by default, but are fully functional with MKL (in fact, it's the default in conda-forge on windows) as well as the netlib/blis flavours. The use of BLAS/LAPACK is indeed a key pillar of the performance and will not change.

Additionally, scipy has a non-trivial amount of native Fortran code (a lot of historical packaging/distribution complications stem from that; see also the entire f2py effort). Nowadays, though, the native Fortran is slowly being replaced by C/C++ code (often through cython/pythran transpilation), not least due to a shortage of reviewer expertise (or rather, of availability among the few who do know Fortran).

Summoning @pearu (Quansight) here.

Since Pearu can be hard to reach, adding some more numpy/scipy maintainers for visibility: @rgommers @seberg @tylerjereddy

ivan-pi commented 2 years ago

I'm interested in how this would work from the perspective of the Fortran processor (jargon for Fortran compiler). For Fortran calling Python, would a Fortran processor embed Python (either CPython, PyPy, or perhaps HPy)? This would mean inserting calls to Py_Initialize() and Py_FinalizeEx() and handling all the data conversion with functions from Python's C API when needed. Or would it use a static Python compiler such as Cython, Nuitka, LPython, or others?

Would interoperability be limited to numeric scalars, strings, and NumPy arrays of compatible NumPy data types? Or, more ambitiously, would it also cover Python sequence types such as lists and tuples, mapping types such as dictionaries, etc.?

Could decorators be used on the Python side to expose functions and classes for use in Fortran (i.e. generate the Fortran module, interfaces, and wrapper code automatically)? What about Python type annotations; could these be used for static typing?
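
As a rough illustration of the decorator question above, the following Python sketch registers annotated functions and emits a Fortran interface block for them. Everything in it is hypothetical: the `expose` decorator, the `fortran_interface` helper, the type mapping, and the `bind(python, ...)` syntax itself, which is only the proposal under discussion and is not accepted by any compiler.

```python
import inspect
import typing

# Hypothetical mapping from Python annotations to Fortran type declarations.
FORTRAN_TYPES = {int: "integer", float: "double precision", bool: "logical"}

EXPOSED = {}  # function name -> Python function


def expose(func):
    """Register func; reject missing or unsupported annotations."""
    hints = typing.get_type_hints(func)
    names = list(inspect.signature(func).parameters) + ["return"]
    for name in names:
        if hints.get(name) not in FORTRAN_TYPES:
            raise TypeError(f"{func.__name__}: no Fortran type for '{name}'")
    EXPOSED[func.__name__] = func
    return func


def fortran_interface(func):
    """Build the text of an interface block using the proposed syntax."""
    hints = typing.get_type_hints(func)
    params = list(inspect.signature(func).parameters)
    lines = [
        "interface",
        f"    function {func.__name__}({', '.join(params)}) result(res) &",
        f'            bind(python, name="{func.__name__}")',
    ]
    lines += [f"        {FORTRAN_TYPES[hints[p]]}, intent(in) :: {p}" for p in params]
    lines += [
        f"        {FORTRAN_TYPES[hints['return']]} :: res",
        "    end function",
        "end interface",
    ]
    return "\n".join(lines)


@expose
def scale(x: float, n: int) -> float:
    return x * n


print(fortran_interface(scale))
```

Note that the sketch only works because `scale` is fully annotated with types from the small mapping; an unannotated function is rejected at registration time, which hints at why a standard would have trouble relying on annotations.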

For Python calling Fortran on the other hand, bind(Python) would effectively be like a "standardized" form of the F2PY tool and language extensions?

It might also be wise to follow the development of the Python array API standard, and see how it can work with Fortran interoperability.

certik commented 2 years ago

@ivan-pi excellent questions. For LFortran I want to pursue what I always wanted as a user, that is, "all the way". So all Fortran features would be wrapped, and all (or almost all) Python features would be usable. And yes, all the things you mentioned have to be resolved, and there is more than one way to resolve them.

But for a standard, I would start at the lowest common denominator, that is, just annotating Fortran functions (to be called from Python) and Fortran interfaces (for Python functions to be called from Fortran). The standard would not specify how the compiler should actually implement it; it would only standardize the syntax and how it should behave in Fortran from the user's perspective. However, even this minimal approach has challenges, such as which features to allow: initially, presumably integers/floats and arrays. But if arrays, are we going to interoperate with NumPy only, or with other packages too? Etc.

A feature like this should require a prior compiler implementation and experience to make sure we like all the details before standardizing.

HaoZeke commented 2 years ago

Thanks for bringing this up! @pearu and I had a short chat about this. I'm going to summarize some of the discussion (any mistakes / omissions are mine).

From a technical / standards perspective I'm not sure this proposal would hold a lot of water.

As far as the Fortran language is concerned, Python is essentially a high level C library currently, and given the differences in language design (interpreted / compiled) this seems to be the only feasible way forward.

A better candidate might be bind(cpp) which would also allow the language to grow very rapidly.

I personally do not believe it is healthy for the language standard to special case interacting with a sub-project like Cython or the other statically compiled variants of Python, since they implement a subset of Python; though in theory one might propose bind(cython) as an approach.

Similarly, typing in Python is 100% optional, so for the same reason it would be difficult to describe a language binding which requires it.
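
A minimal sketch of why optional typing is a problem for a binding: in CPython, annotations are stored as metadata and are not checked when the function is called. The function name `double` is just an illustration.

```python
def double(x: int) -> int:
    # The annotations promise int -> int, but nothing enforces them.
    return x * 2


# Annotations are plain metadata attached to the function object.
print(double.__annotations__)

print(double(21))    # 42
print(double("ab"))  # abab (a str, despite the annotation)
print(double(1.5))   # 3.0 (a float, despite the annotation)
```

A Fortran interface declared from these annotations would therefore describe what the author intended, not what the interpreter guarantees.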

Note that none of these concerns are valid for user-binding generation codes (f2py, f90wrap, fmodpy, lfortran etc.), which have the happy circumstance of being able to define their own scope to be more narrow than the language specification.

Also, as @certik pointed out, we should have working implementations and experience. By that measure, this might be a bit premature. There are many approaches gaining traction jointly and converging towards a consolidated image of what Python looks like when interoperated with Fortran. Some are:

These are just some off the top of my head, and apologies to any projects I missed. Once these user-land approaches have stabilized (and efforts are ongoing to keep communication open so there's no fragmentation of the ecosystem), then and only then would it make sense to have the standards committee consider such a proposal.

Apart from ease of user-experience, at this moment I do not think there are any intrinsic functions or compatibility requests / helpers (e.g. those found in iso_c_binding) which would be required (and are not covered by existing stipulations).

(not including features requested for other purposes which would help this, like generics support)

certik commented 2 years ago

I agree with @HaoZeke. In addition however I will also say that as a user, I think I would want this. Even more, as a user, I just want to use any Python library and I don't even want to write the wrappers and everything should just work. And from Python I just want to import any Fortran library and everything should just work.

jeffhammond commented 2 years ago

The reason Fortran can interoperate with C is that C is unique in having a well-defined ABI. Python doesn't even require types, so it can't begin to have an ABI, so the task proposed here is impossible. The only reasonable way to interoperate Python and Fortran is to use the C interoperability capability of both.

ivan-pi commented 2 years ago

Python (or specifically CPython) does have a stable C ABI: https://docs.python.org/3/c-api/stable.html#stable-application-binary-interface

Of course it's not an ISO standard like C and Fortran, but many of us do not care in practice.

klausler commented 2 years ago

The reason Fortran can interoperate with C is that C is unique in having a well-defined ABI. Python doesn't even require types, so it can't begin to have an ABI, so the task proposed here is impossible. The only reasonable way to interoperate Python and Fortran is to use the C interoperability capability of both.

I think that the point of having Python <-> Fortran interoperability is to automate and hide the use of the C ABI so that it's not a big deal to haul data between the languages or to invoke procedures, especially Python calling Fortran.
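
To make the "C ABI in the middle" concrete, here is a small sketch using Python's standard `ctypes` module. It wraps a Python function as a C function pointer with the signature `double f(double)`, which is the kind of object a `bind(C)` procedure could receive today. No Fortran is compiled here; the round trip through a raw address only simulates what a compiled caller would do, and the helper name `as_c_callback` is made up for the example.

```python
import ctypes
import math

# C prototype: double f(double). A Fortran caller would describe this with
# a bind(C) abstract interface taking a real(c_double) argument by value
# (an assumption about the Fortran side, not shown or compiled here).
C_UNARY = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)


def as_c_callback(pyfunc):
    """Wrap a Python callable in a C-callable function pointer."""
    return C_UNARY(pyfunc)


# Keep a reference to the wrapper: the pointer is only valid while it lives.
c_sin = as_c_callback(math.sin)

# The raw address is what would be handed across the language boundary.
address = ctypes.cast(c_sin, ctypes.c_void_p).value

# Simulate the foreign caller: rebuild a callable from the bare address
# and invoke it through the C calling convention.
from_address = C_UNARY(address)
print(from_address(0.5), math.sin(0.5))
```

All of the bookkeeping visible here (prototypes, lifetime of the wrapper, raw addresses) is what a language-level binding would be expected to automate and hide.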

thomas-robinson commented 2 years ago

If C is interoperable with python, and C is interoperable with Fortran, then there should be some way to eliminate the C middle man. As a user, it would provide a lot of convenience and allow for easy usage of python that is being used for accelerated computing.

jeffhammond commented 2 years ago

I'd like to see a proper implementation of this in one of the major Fortran compilers and a report on the implementation difficulty before considering this.

FortranFan commented 2 years ago

a proper implementation of this in one of the major Fortran compilers

What is a "major Fortran compiler"? It's a rhetorical question! The very notion of it sounds rather discriminatory and unconvincing.

HaoZeke commented 2 years ago

a proper implementation of this in one of the major Fortran compilers

What is a "major Fortran compiler"? It's a rhetorical question! The very notion of it sounds rather discriminatory and unconvincing.

Perhaps let's leave this as something to be revisited when there is a complete set of bindings of enough language features in any user-land library or compiler.

HaoZeke commented 2 years ago

If C is interoperable with python, and C is interoperable with Fortran, then there should be some way to eliminate the C middle man. As a user, it would provide a lot of convenience and allow for easy usage of python that is being used for accelerated computing.

The problem is that this is not exactly true. Not all of Python is compatible with Fortran in a transparent way.

Also, C is not completely interoperable with Fortran, and the primary argument is still that from a language perspective, unless we would like to make statements about the GC and other interpreted features, binding Python-C should be left to user-code.

Currently some common features can be emulated / bound, and perhaps many more can be, but to say that there is immediately a way to embed an interpreted language with a compiled one because they have bindings to a common language might be premature.

perazz commented 2 years ago

I also endorse bind(cpp) as a far better alternative to bind(python): it would enable plenty of code for IO, GUI, network, you name it, to be called to and from Fortran in a native way, without having to go through C middleware.

C binding is excellent for binding Fortran to essentially ANY other language, but I think binding C++ and Fortran directly would make a lot more sense:

klausler commented 2 years ago

I hereby correct you, as requested.

First, Fortran's derived types are not a "subset of C++ classes". You can do things with derived types that you cannot with classes, and vice versa. You might be able to make something work by considering the intersection of the sets of capabilities of the two languages, but that intersection does not contain Fortran's derived types. (Examples of Fortran features not shared by C++ include, but are not limited to: distinct final subroutines for arrays, defined I/O, PASS arguments other than the first, default component initialization without constructors, "automatic" components dependent on LEN type parameters, ...)

Second, the implementation of derived type features in Fortran is often (but not necessarily) tied to the implementation of the descriptors used for dummy arguments, allocatables, and/or pointers, so that an object of a polymorphic derived type can provide the dynamic type (and its parameters) to generated code and the runtime support library. The common descriptor mandated for interoperability in the standard, however, does not provide for any dynamic non-intrinsic type information. (The interoperable descriptor format is also not interoperable between Fortran implementations, shockingly enough -- the standard mandates that certain fields be present in the structure, but not their layout or enumeration values.)

jeffhammond commented 2 years ago

Show me how Fortran arrays are a subset of C++ and I'll consider the class argument.

perazz commented 2 years ago

Thank you @klausler, that's a very clear summary. I didn't mean to generalize to a full compatibility. Same goes for bind(C) where severe limitations also apply, including the impossibility to use arrays. Talking about classes, I think most of your observations could be very good defining properties of the bind(cpp) binding, for example, interoperable classes may not have:

I understand things are not so easy implementation-wise, but some things are not standardized yet, so this would be a good opportunity to do it (in the same way that C binding forces Fortran objects to have the same C ABI, for example in a DLL).

@jeffhammond Regarding arrays, one could have a C++ equivalent of ISO_Fortran_binding.h, in which templated classes for at least the integer, real, and complex arrays are defined. That would be a fantastic way to make Fortran arrays available in C++, but I understand I'm dreaming a bit too much here...

klausler commented 2 years ago

Elemental FINAL subroutines are actually the one case of FINAL that would work, if any did.

I have many more items on my mental list of Fortran features that do not apply to C++ classes. I don't think that Fortran/C++ interoperability that's any better than BIND(C) is a tractable problem. If you disagree, I invite you to implement a demonstrable prototype yourself.