fortran-lang / stdlib

Fortran Standard Library
https://stdlib.fortran-lang.org
MIT License
1.03k stars 163 forks source link

Better Interoperability with R Language #280

Open waynelapierre opened 3 years ago

waynelapierre commented 3 years ago

I have been both a Fortran and R user for a while. R's native interface with Fortran is not very straightforward to use. By straightforward, I mean compiling a Fortran function on the fly in the R environment. Meanwhile, Rcpp, R's improved interface with C++, makes it very straightforward to develop R packages with C++. However, R and C++ are very different languages.

On the other hand, R and Fortran are both numerical programming languages with 1-based indexing. Vectorization is always encouraged in R, which is consistent with the design philosophy of Fortran. I think if Fortran could have a better interface with R and its standard library could contain more stats and optimization functions, more users will use Fortran to write R packages.

ivan-pi commented 3 years ago

Hello @waynelapierre,

This sounds like a great idea! I think making it easy to interoperate with Fortran is beneficial to both languages in the end.

I had a quick look at this tutorial about Rcpp. Ironically, on slide 14 is a sketch of "The Interface" [to Fortran], which became S, which led to R. So many years later, there is so many layers of abstraction between them, that you need to write new code to find your way back...

Taking a quick look I see that the C interface to R revolves around a SEXP object (pointer). I assume this could be passed around in Fortran as a type(c_ptr). But for a smoother transition I guess it would be necessary to write some C/C++ code using the "ISO_Fortran_Bindings" introduced in F2018 to get more straightforward access into the underlying R memory. Some very basic examples of what this looks like appear in this thread. I don't really know enough about R to say more.

At least initially I would suggest exploring the R-Fortran bridge in a repository separate from stdlib. But we are more than happy to receive help with statistics and optimization routines!

certik commented 3 years ago

We plan to have automatic wrappers to other languages such as Python in LFortran. We would be happy to add an R backend also. Interoperability of Fortran with other languages is key.

waynelapierre commented 3 years ago

R and Matlab both started as a wrapper of Fortran codes and were designed as tools for engineers and statisticians. That's why R and Matlab both use 1-based indexing and have built-in data types and functions related to linear algebra. Over time, R and Matlab transitioned to use C/C++ for performance. People claim that is because Fortran does not have a good package manager and lacks good support for strings. That is not true. C/C++ does not have a good package manager, either. For most scientific computing tasks, strings are not that important. I personally think the lack of a good standard library is the reason why Fortran loses a lot of users.

If Fortran could have a good standard library and convenient interface to R. I am sure more users will use Fortran to develop R packages. For most engineers or statisticians, C/C++ is too complicated and unintuitive.

sakamoti commented 3 years ago

I have the fortran(2008) library written by myself as a dynamic library on my system, and use it from fortran and R. If fortran is the main program, I can use many fortran full features (including classes). However, when using R's fortran interface (.Fortran or .C functions) from inside R's environment, arguments can be passed only in C or f77 format (I think). (It might be the best idea to try using SEXP via c/c++, but I don't familiare with SEXP.)

I'm not familiar with c++ or c. I've intentionally written a f77 wrapper routines for R in order to use the new fortran routines (OOP and other new feature) from R. The seamless integration of fortran and R is very welcome.

ivan-pi commented 3 years ago

I just did a small experiment following the instructions on page 20 of this R newsletter. Here are the Fortran routines to start with:

! Rfun_f.f90
!
module Rfun

  use iso_c_binding
  implicit none

contains

  subroutine fooC() bind(C,name="fooC")
    ! do nothing
  end subroutine

  subroutine barC(x, len) bind(C,name="barC")
    double precision :: x
    integer :: len
    ! do nothing
  end subroutine

  type(c_ptr) function mycall(obj) bind(C,name="mycall")
    type(c_ptr), value :: obj
    myCall = obj
  end function

end module

Next, you need an interface file, here called Rfun.c with the following contents:

#include <Rinternals.h>
#include <R_ext/Rdynload.h>

// These are the routines defined in Rfun.f90
void fooC();
void barC(double *, int*);
SEXP mycall(SEXP obj);

static const R_CMethodDef cMethods[] = {
  {"fooC",  (DL_FUNC) &fooC, 0},
  {"barC", (DL_FUNC) &barC, 2},NULL};

static const R_CallMethodDef callMethods[] = {
  {"mycall", (DL_FUNC) &mycall, 1},NULL};

void R_init_Rfun(DllInfo *info)
{
  R_registerRoutines(info,cMethods,callMethods,NULL,NULL);
}

The next step is to compile the Fortran routines into a shared library using the R command line tool:

R CMD SHLIB Rfun.c Rfun_f.f90 -o Rfun.so

This creates a shared library object Rfun.so. Using the dyn.load() function in R you can load the library and access the functions using the .C and .Call primitives (no need for the .Fortran function):

$ R

R version 3.4.4 (2018-03-15) -- "Someone to Lean On"
Copyright (C) 2018 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> dyn.load("Rfun.so")
> a = 1:8
> a
[1] 1 2 3 4 5 6 7 8
> .Call("mycall",a)
[1] 1 2 3 4 5 6 7 8
> 

Granted, this only works for functions and subroutines with the the bind(C) attribute (careful with lower and upper case names!). The remaining challenge is to access the SEXP internals and gain access to the underlying R memory (actually allocated in C) in a safe and portable way.

I think it would be great if LFortran could build the necessary glue code automatically.

sakamoti commented 3 years ago

We can see the definition of SEXP in the R terminal.

file.show(file.path(R.home("include"),"Rinternals.h"))

In the Rinternals.h , SEXP is defined as a pointer of the the struct of SEXPREC.

typedef struct SEXPREC *SEXP;
#define SEXPREC_HEADER \
    struct sxpinfo_struct sxpinfo; \
    struct SEXPREC *attrib; \
    struct SEXPREC *gengc_next_node, *gengc_prev_node

/* The standard node structure consists of a header followed by the
   node data. */
typedef struct SEXPREC {
    SEXPREC_HEADER;
    union {
    struct primsxp_struct primsxp;
    struct symsxp_struct symsxp;
    struct listsxp_struct listsxp;
    struct envsxp_struct envsxp;
    struct closxp_struct closxp;
    struct promsxp_struct promsxp;
    } u;
} SEXPREC;

SEXPREC has C's union. According to "Modern fortran explained (incorporating Fortran 2018), pp.371", union can't interoperable with fortran.

No fortran type is interoperable with a C union type, a C structure type that contains a bit field, or a C structure type that contains a flexible array member.

Though I don't know how to acces the SEXP internals from fortran, I have the same opinion of @ivan-pi 's one.

The remaining challenge is to access the SEXP internals and gain access to the underlying R memory (actually allocated in C) in a safe and portable way.

ivan-pi commented 3 years ago

The SEXP (S-expression) is an opaque pointer to an underlying SEXPREC object (this is a common design pattern in C). Client code is not supposed to manipulate the SEXPREC directly, but only using methods provided by the various R headers, that provide a limited view into the node.

waynelapierre commented 3 years ago

https://cran.r-project.org/web/packages/dotCall64/index.html