LLNL / shroud

Shroud: generate Fortran and Python wrappers for C and C++ libraries
BSD 3-Clause "New" or "Revised" License

Function returning string* type #314

Closed CarvFS closed 1 year ago

CarvFS commented 1 year ago

Hello!

First of all: thank you very much for developing shroud!

I am working on code that needs to access a string pointer. I have managed to get it working when the string pointer points to a single string. However, I would like to do the same with a string pointer pointing to an array of strings.

Looking through the scripts available here, I found one that seems to be what I need:

In strings.yaml : - decl: const std::string *get_str_ptr() +deref(pointer)

However, I am getting an error saying:

Wrote tutorial_types.yaml
Error with template: 'int {c_var_len}' 

and (I believe) it is due to the +deref(pointer) option.

I will put the scripts to generate the necessary files and run my test example below:

tutorial.yaml

library: Tutorial
format:
  F_filename_suffix: F90
cxx_header: tutorial.hpp

declarations:
- decl: namespace tutorial
  declarations:
  - decl: class Class1  
    declarations:
    - decl: Class1() +name(new)
    - decl: void printvalues()
    - decl: int* get_int_ptr(int *len +intent(out)+hidden) +deref(pointer) +dimension(len)

    ####################### Modify here #######################
    # - decl: const std::string *get_str_ptr() +deref(pointer)
    ###########################################################

    - decl: ~Class1() +name(delete)

tutorial.hpp

#ifndef CLASS1_HPP
#define CLASS1_HPP
#include <iostream>
#include <string>
using namespace std;

namespace tutorial {

    struct str1{
        int *iptr;
        string *names;
        int str_size;
        int iptr_size;
    };
    typedef struct str1 str1;

    class Class1
    {
    public:
      Class1();

      void printvalues();

      int* get_int_ptr(int *len);

      //////////////// Modify here ////////////////
      const string* get_str_ptr();
      ////////////////////////////////////////////

      ~Class1();

    };

}

#endif // CLASS1_HPP

tutorial.cpp

#include <iostream>
#include "tutorial.hpp"
#include <sstream>

namespace tutorial {
    static str1 s;

    Class1 :: Class1(){
        cout << "Object is being created!" << endl;   
        s.iptr_size = 4;
        s.str_size = 3;

        s.iptr = new int[s.iptr_size];
        s.names = new string[s.str_size];

        for(int i = 0; i < s.iptr_size; i++){
            s.iptr[i] = i+123;
        }

        s.names[0] = "test1";
        s.names[1] = "test2";
        s.names[2] = "test3";

    }

    void Class1 :: printvalues(){
        cout << "From printvalues:" << endl;
        for(int i = 0; i < s.iptr_size; i++){
            cout << "   iptr[" << i << "] = " << s.iptr[i] << endl;
        }

        for(int i = 0; i < s.str_size; i++){
            cout << "   names[" << i << "] = " << s.names[i] << endl;
        }
    }

    int* Class1 :: get_int_ptr(int *len){
        *len = s.iptr_size;
        return s.iptr;
    }

    //////////////// Modify here ////////////////
    const string* Class1 :: get_str_ptr(){
        return s.names;
    }
    ////////////////////////////////////////////

    Class1 :: ~Class1(){
        cout << "Object is being deleted!" << endl;
    }

}

test_shroud.F90

program test_shroud
    use iso_c_binding
    use tutorial_tutorial_mod
    type(class1) cptr
    integer, pointer :: iptr_f(:)
    !!!!! Create object
    cptr = Class1() 
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

    call class1_printvalues(cptr)

    !!> Getting int* data from C++
    iptr_f => class1_get_int_ptr(cptr)
    !!> Print
    write(*,*) iptr_f

    !!> Want to print array of strings:
    ! write(*,*) class1_get_str_ptr(cptr)

    !!!!! Delete object
    call cptr%delete()
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
end program test_shroud

Makefile

# the compiler: gcc for C program, define as g++ for C++
CC = g++
GF90 = gfortran
LDLIBS  = -lgfortran -lstdc++ 

test_shroud: tutorial.o wraptutorial_Class1.o wrapfTutorial_tutorial.o test_shroud.F90
    $(GF90) tutorial.o wraptutorial_Class1.o wrapfTutorial_tutorial.o test_shroud.F90 -o main $(LDLIBS)

wrapfTutorial_tutorial.o: wraptutorial_Class1.o wrapfTutorial_tutorial.F90
    $(GF90) -c wrapfTutorial_tutorial.F90

wraptutorial_Class1.o: wraptutorial_Class1.cpp tutorial.o
    $(CC) -c wraptutorial_Class1.cpp

tutorial.o: tutorial.cpp
    $(CC) -c tutorial.cpp

I would like to get the pointer to strings as I have got the pointer to integers.

Thank you in advance!

CarvFS commented 1 year ago

Hi :)

Has anyone had a chance to take a look at this?

CarvFS commented 1 year ago

I have managed to get the array of strings from C++ :) I used the char** type. I had to concatenate all the words into a single string (padding words shorter than 4 characters with spaces) before passing it to Fortran. I do not know if this is the best way to do it, but if anyone wants to test or modify it, here is an update to the code I posted previously.

Sorry about these long posts... I just want to make sure anyone would be able to replicate my case. Each necessary file is given below:

tutorial.yaml

library: Tutorial
format:
  F_filename_suffix: F90
cxx_header: tutorial.hpp

splicer:
  f:
  - fsplicer.f

declarations:
- decl: namespace tutorial
  declarations:
  - decl: class Class1  
    declarations:
    - decl: Class1() +name(new)
    - decl: void set_strings()
    - decl: void printvalues()
    - decl: int* get_int_ptr(int *len +intent(out)+hidden) +deref(pointer) +dimension(len)

    ####################### Modify here #######################
    - decl: void get_strs(char** strs +deref(pointer), int* name_len +hidden);
      fstatements:
        f:
          f_module: 
            iso_c_binding: ["C_LOC","C_F_POINTER"]
    ###########################################################

    - decl: ~Class1() +name(delete)

fsplicer.f

! splicer begin namespace.tutorial.class.Class1.method.get_strs
type(C_PTR) cstr
cstr = C_LOC(strs)
call c_class1_get_strs_bufferify(obj%cxxmem, cstr, name_len)
call C_F_POINTER(cstr,strs,[2])
print*,strs(1),strs(2),name_len
! splicer end namespace.tutorial.class.Class1.method.get_strs

tutorial.hpp

#ifndef CLASS1_HPP
#define CLASS1_HPP
#include <iostream>
#include <cstring>
#include <string>
using namespace std;

namespace tutorial {

    struct str1{
        int *iptr;
        string *names;
        int str_size;
        int iptr_size;
    };
    typedef struct str1 str1;

    class Class1
    {
    public:
      Class1();

      void set_strings();

      void printvalues();

      int* get_int_ptr(int *len);

      //////////////// Modify here ////////////////
      // const string* get_str_ptr();
      void get_strs(char** strs, int* name_len);
      ////////////////////////////////////////////

      ~Class1();

    };

}

#endif // CLASS1_HPP

tutorial.cpp

#include <iostream>
#include "tutorial.hpp"
// #include "NewClass.hpp"
#include <sstream>

namespace tutorial {
    static str1 s;

    Class1 :: Class1(){
        cout << "Object is being created!" << endl;   
        s.iptr_size = 4;
        s.str_size = 4;

        s.iptr = new int[s.iptr_size];

        for(int i = 0; i < s.iptr_size; i++){
            s.iptr[i] = i+123;
        }

    }

    void Class1 :: set_strings(){
        s.names = new string[s.str_size];
        s.names[0] = "Lucy";
        s.names[1] = "Mina";
        s.names[2] = "Jo";
        s.names[3] = "Ruth";

        // Fill names with less than 4 characters with spaces
        for(int i = 0; i < s.str_size; i++){
            s.names[i].insert(s.names[i].end(), 4 - s.names[i].size(), ' ');
        }
    }

    void Class1 :: printvalues(){
        cout << "From printvalues:" << endl;
        for(int i = 0; i < s.iptr_size; i++){
            cout << "   iptr[" << i << "] = " << s.iptr[i] << endl;
        }

        for(int i = 0; i < s.str_size; i++){
            cout << "   names[" << i << "] = " << s.names[i] << endl;
        }
    }

    int* Class1 :: get_int_ptr(int *len){
        *len = s.iptr_size;
        return s.iptr;
    }

    //////////////// Modify here ////////////////
    void Class1 :: get_strs(char** strs, int* name_len){
        cout << "====== In: get_strs =======" << endl;
        *name_len = s.str_size;
        char *buf = new char[s.str_size*4];
        for(int i = 0; i < *name_len; i++){
            if(i == 0){
                strcpy(buf,const_cast<char*>(s.names[i].c_str()));
            }
            else{
                strcat(buf,const_cast<char*>(s.names[i].c_str()));
            }
        }
        *strs = buf;
        cout << "====== Out: get_strs =======" << endl;
    }
    ////////////////////////////////////////////

    Class1 :: ~Class1(){
        cout << "Object is being deleted!" << endl;
    }

}

test_shroud.F90

program test_shroud
    use iso_c_binding
    use tutorial_tutorial_mod
    type(class1) cptr
    character(len = 4), pointer :: names(:)

    !!!!! Create object
    cptr = Class1() 

    call class1_set_strings(cptr)
    call class1_printvalues(cptr)

    !!> Getting int* data from C++
        !!> Print 1D array
    write(*,*) class1_get_int_ptr(cptr)

        !!> Get array of strings
    call class1_get_strs(cptr,names)

        !!> Verify the values retrieved from C++
    call test_receive_str(names,4)

    !!!!! Delete object
    call cptr%delete()
end program test_shroud

subroutine test_receive_str(fstrs, names_size)
    integer :: names_size
    character(len = 4) :: fstrs(names_size)
    do i = 1,names_size
        write(*,*) "in subroutine: String returned from C++: ", fstrs(i)
    end do
end subroutine test_receive_str

Makefile

CC = g++
GF90 = gfortran
LDLIBS  = -lgfortran -lstdc++ 

test_shroud: tutorial.o wraptutorial_Class1.o wrapfTutorial_tutorial.o test_shroud.F90
    $(GF90) tutorial.o wraptutorial_Class1.o wrapfTutorial_tutorial.o test_shroud.F90 -o main $(LDLIBS)

wrapfTutorial_tutorial.o: wraptutorial_Class1.o wrapfTutorial_tutorial.F90
    $(GF90) -c wrapfTutorial_tutorial.F90

wraptutorial_Class1.o: wraptutorial_Class1.cpp tutorial.o
    $(CC) -c wraptutorial_Class1.cpp

tutorial.o: tutorial.cpp
    $(CC) -c tutorial.cpp

ltaylor16 commented 1 year ago

Thank you for your patience and the complete example. I was able to build the example.

I'm glad you were able to get something running. But it does have some issues I'm sure you're aware of. The lengths are fixed at 4 wide and the C++ function needs to do the blank filling itself. In addition, the memory allocated at tutorial.cpp line 55, char *buf = new char[s.str_size*4] will leak. The Fortran pointer has the address so you can still delete it but it has to be done by a C++ delete.
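One way to address the leak is to pair the allocating function with a matching release function that the Fortran side calls through a wrapper once it is done with the data. The sketch below is only an illustration under assumptions: `pack_fixed_width` and `free_packed` are hypothetical names, not part of Shroud or of the example above. It also uses memset/memcpy instead of strcpy/strcat, because those functions write a terminating NUL, which does not fit in a buffer of exactly `s.str_size*4` characters.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: pack `names` into one contiguous buffer of
// names.size() * width characters, blank-padded and without NUL
// terminators. This matches the memory layout of a Fortran
// character(len=width) array.
char *pack_fixed_width(const std::vector<std::string> &names, std::size_t width)
{
    assert(width > 0);
    char *buf = new char[names.size() * width];
    std::memset(buf, ' ', names.size() * width);  // blank fill everything first
    for (std::size_t i = 0; i < names.size(); ++i) {
        // Copy at most `width` characters; longer names are truncated.
        std::size_t n = names[i].size() < width ? names[i].size() : width;
        std::memcpy(buf + i * width, names[i].data(), n);
    }
    return buf;
}

// Hypothetical companion: the buffer came from new[], so it has to be
// released with delete[] on the C++ side, never with Fortran deallocate.
void free_packed(char *buf)
{
    delete[] buf;
}
```

The Fortran caller would keep the address it received and hand it back to a wrapped `free_packed` when the names are no longer needed.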

There is an example in regression/input/vectors.yaml that does something similar to what you have:

- decl: void vector_string_fill(std::vector< std::string > &arg+intent(out))
  options:
    wrap_c: False
    wrap_fortran: False

However, you'll notice that the C and Fortran wrappers are not being generated. This was more of a note-to-self so I'd get back to it and make it run eventually.

The biggest issue with strings is the blank-filled vs. null-terminated distinction, which requires a copy to make things look natural in Fortran. There is also the fact that Fortran has arrays of strings where each string is expected to have the same len, unlike char **, which allows ragged arrays. If your real code is like your example, with a few very short strings, I find it easier to pass in a character(*) arg(:) array which the wrapper can then fill up. That way you don't have to worry about leaking memory. But that of course assumes you know the size ahead of time.
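The "pass in an array and let the wrapper fill it" approach can be sketched on the C++ side as follows. This is a minimal illustration, not Shroud's generated code: `fill_fixed_width` and `trim_field` are hypothetical names, and the caller is assumed to own a buffer of `nslots * width` characters, as a Fortran character(len=width) array of nslots elements would provide.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: copy the strings in `src` into a caller-owned buffer
// of nslots fields, each `width` characters wide. Short strings are
// blank-padded, long strings are truncated, unused fields stay blank.
// Returns the number of strings actually copied.
std::size_t fill_fixed_width(const std::vector<std::string> &src, char *out,
                             std::size_t nslots, std::size_t width)
{
    assert(out != nullptr);
    std::memset(out, ' ', nslots * width);
    std::size_t ncopy = std::min(src.size(), nslots);
    for (std::size_t i = 0; i < ncopy; ++i) {
        std::size_t len = std::min(src[i].size(), width);
        std::memcpy(out + i * width, src[i].data(), len);
    }
    return ncopy;
}

// Hypothetical helper for the opposite direction: turn one blank-filled
// field back into a std::string by dropping the trailing blanks.
std::string trim_field(const char *field, std::size_t width)
{
    while (width > 0 && field[width - 1] == ' ')
        --width;
    return std::string(field, width);
}
```

Because the caller owns the buffer, nothing is allocated here and nothing can leak; the cost is that the field width and the number of slots have to be chosen in advance.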

I'll be out for a couple of weeks, but I'll make the vector_string_fill example work when I get back. And there are probably some other similar cases that can be addressed at the same time. My goal is to make the Fortran wrappers natural to a Fortran programmer without having to write any custom code. Ideally the wrapper would do the blank fill that you've had to add explicitly in your example.

CarvFS commented 1 year ago

Thank you very much for your reply and comments! I had hardcoded the lengths just to get it working. I have modified it so that the length is received as a hidden argument, so the user does not need to provide it by hand:

- decl: void get_strs(char** strs +deref(pointer), int* name_len +hidden, int str_len +hidden);
  fstatements:
    f:
      f_module:
        iso_c_binding: ["C_LOC","C_F_POINTER"]

and in the splicer I have added:

! splicer begin namespace.tutorial.class.Class1.method.get_strs
type(C_PTR) cstr
cstr = C_LOC(strs)
str_len = len(strs)
call c_class1_get_strs_bufferify(obj%cxxmem, cstr, name_len)
call C_F_POINTER(cstr,strs,[2])
print*,strs(1),strs(2),name_len
! splicer end namespace.tutorial.class.Class1.method.get_strs

This way it picks up the character pointer length on the Fortran side and allocates the right amount of memory:

char *buf = new char[s.str_size*str_len];

The str_len variable was also added to set_strings, so that the right number of blank spaces is added on the C++ side if the length in the main Fortran program changes:

- decl: void set_strings(int str_len +hidden)

(I have added the str_len = len(strs) in the fsplicer file for this function too.)

void Class1 :: set_strings(int str_len){
    s.names = new string[s.str_size];
    s.names[0] = "Lucy";
    s.names[1] = "Mina";
    s.names[2] = "Jo";
    s.names[3] = "Ruth";

    // Pad names shorter than str_len characters with spaces
    for(int i = 0; i < s.str_size; i++){
        s.names[i].insert(s.names[i].end(), str_len - s.names[i].size(), ' ');
    }
}

Now neither the set nor the get function has the character length hardcoded anymore :) This does not remove the blank filling on the C++ side, but at least it makes it more automatic, so one does not need to worry about this step. I tried several strategies to get it working and this was the only one I could think of (though I did not spend much time optimizing it...).
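One caveat with the padding expression `str_len - s.names[i].size()`: it mixes a signed int with the unsigned size type, so if a name is ever longer than str_len the subtraction wraps around to a very large count on typical platforms and the insert fails. A minimal sketch of a pad-or-truncate step that avoids this, using std::string::resize, is shown here; `to_fixed_width` is a hypothetical helper name, not something from the code above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: return `name` as exactly `width` characters.
// resize() appends blanks when the string is shorter than `width`
// and cuts it off when it is longer, so no subtraction is needed.
std::string to_fixed_width(std::string name, std::size_t width)
{
    name.resize(width, ' ');
    assert(name.size() == width);
    return name;
}
```

In set_strings, the loop body would then become something like `s.names[i] = to_fixed_width(s.names[i], str_len);`, assuming str_len is non-negative.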

The strings in the actual code are short, less than 4 characters in length. However, the quantity of strings can vary depending on the case.

Thank you for pointing out the memory leak issue. I will take a look at it!

I look forward to your updates! :)

ltaylor16 commented 1 year ago

I've added changes to the develop branch which will make it easier to deal with std::string ** arguments. The C++ functions only need to assign to the dereferenced pointer arguments. There is no need to deal with blank padding.

void Class1 :: get_strs2(string **strs, int* name_len){
    cout << "====== In: get_strs2 =======" << endl;
    *name_len = s.str_size;
    *strs = s.names;
    cout << "====== Out: get_strs2 =======" << endl;
}

This is wrapped as:

- decl: void get_strs2(
            string** strs +intent(out)+dimension(name_len)+deref(allocatable),
            int* name_len +intent(out)+hidden);

The generated Fortran requires an allocatable argument. I'm using the Fortran object oriented syntax to call the function.

character(len=:), allocatable :: anames(:)
call cptr%get_strs2(anames)
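Since every element of a `character(len=:), allocatable` array shares one length, a ragged `std::string` array has to be brought to a common width before it can be stored in one. The following is only a conceptual sketch of that sizing-and-copy step, not the code Shroud generates; `flatten_uniform` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: copy n ragged strings into one contiguous,
// blank-padded buffer whose field width is the length of the longest
// string. `width` receives that common length.
std::vector<char> flatten_uniform(const std::string *strs, std::size_t n,
                                  std::size_t &width)
{
    assert(strs != nullptr || n == 0);
    width = 0;
    for (std::size_t i = 0; i < n; ++i)
        width = std::max(width, strs[i].size());

    std::vector<char> flat(n * width, ' ');  // every field starts blank
    for (std::size_t i = 0; i < n; ++i)
        std::copy(strs[i].begin(), strs[i].end(), flat.data() + i * width);
    return flat;
}
```

With the four names from the example, the common width comes out as 4 and "Jo" is stored as "Jo" followed by two blanks, which is what a Fortran caller would see in `anames(3)`.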

It's also possible to create a wrapper which will fill an existing variable.

- decl: void get_strs2(
            string** strs +intent(out)+dimension(name_len),
            int* name_len +intent(out)+hidden);

The generated Fortran requires a local variable.

character(len=4) :: snames(10)
call cptr%get_strs2(snames)

On the plus side, there is no need to allocate memory. But on the minus side, you must know the maximum possible output. Any strings which will not fit into the Fortran variable will be truncated.

CarvFS commented 1 year ago

Great! Thank you very much!!!

I will take a look at the modifications you have made and ask if I have any questions :)