pleiszenburg / zugbruecke

Calling routines in Windows DLLs from Python scripts running under Linux, MacOS or BSD
https://zugbruecke.readthedocs.io/en/latest/
GNU Lesser General Public License v2.1

memsync with pointers to pointers [R-style strings with STRSXP & CHARSXP] #42

Open chiuczek opened 6 years ago

chiuczek commented 6 years ago

I'm trying to wrap a Windows DLL designed to be used with R, where all strings are passed to the DLL as char**, i.e. a pointer to a pointer to char. I can make things work on Windows/wine-python, but I can't get memsync to synchronise the right memory when I use zugbruecke.

Minimal example:

C code:

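#include <string.h>

/* Writes "test" into the first string of the array; `unrelated` is not touched. */
void test(char **strings, int *unrelated) {
    strcpy(strings[0], "test");
}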

Python:

import zugbruecke as ctypes
from zugbruecke import POINTER, c_char, c_int

__lib__ = ctypes.cdll.LoadLibrary("test.dll")
testfn = __lib__.test
testfn.argtypes = (POINTER(POINTER(c_char)), POINTER(c_int))
testfn.memsync = [
    { "I don't know what to use here" }
]
msg = (ctypes.c_char_p * 1)(b"                                            ")
msg_p = ctypes.cast(msg, POINTER(POINTER(c_char)))
ierr = c_int(1)

testfn(msg_p, ctypes.byref(ierr))
print(msg[0].decode('utf8'))

Is this something that is currently possible?

s-m-e commented 6 years ago

Looking at your example, it appears that you're passing a pointer to an array of pointers to null-terminated strings, right? Or is there length information somewhere given as an integer for every individual string in the array?

The short answer is that I have not implemented it yet. However, it should be reasonably easy to add. I'll see what I can do. In the meantime, you still need some information on the length of the array of pointers - i.e. how many strings are contained in your array? Somewhere in the int *unrelated section, there must be information along those lines.

chiuczek commented 6 years ago

Yes, eventually there's a null-terminated string, but there's only one of them. It's a strange interface, but I think it's because the library is designed to interface with R using R's C interface functions, which specify that strings get passed as char **. The unrelated integer really is unrelated: the function I've mimicked returns an error message for a given return number.
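For reference, the memory layout described here (a one-element array of pointers, whose single entry points to a null-terminated string) can be sketched with plain ctypes and no DLL at all. This is only an illustration under stated assumptions: `fake_test` is a hypothetical pure-Python stand-in for the C routine, and the 64-byte buffer size is arbitrary. It uses `create_string_buffer` rather than a `bytes` literal because the callee writes into the memory, and `bytes` objects are meant to be immutable.

```python
import ctypes
from ctypes import POINTER, c_char, c_int


def fake_test(strings, unrelated):
    """Hypothetical stand-in for the C routine: strcpy(strings[0], "test")."""
    payload = b"test\x00"  # include the terminating NUL, as strcpy would
    ctypes.memmove(strings[0], payload, len(payload))


# Writable 64-byte buffer (size is an arbitrary assumption for this sketch).
buf = ctypes.create_string_buffer(64)

# One-element array of char*, then a char** pointing at that array.
arr = (POINTER(c_char) * 1)(ctypes.cast(buf, POINTER(c_char)))
msg_p = ctypes.cast(arr, POINTER(POINTER(c_char)))

ierr = c_int(1)
fake_test(msg_p, ctypes.byref(ierr))

# .value stops at the first NUL byte.
result = buf.value.decode("utf8")
print(result)
```

Two separate blocks of memory are involved (the pointer array and the character buffer), so a bridge that copies only the pointer array across a process boundary would not carry the string contents along.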

s-m-e commented 6 years ago

Ok, this makes (some) sense. I'll look into it and try to come up with a working example (calling into the project's demo DLL file based on your use case).

s-m-e commented 6 years ago

I started a new branch and added a test case based on your example. As expected, it passes with wine-python & ctypes and fails with zugbruecke.

Additional documentation:

A STRSXP contains a vector of CHARSXPs, where each CHARSXP points to a C-style string stored in a global pool. This design allows individual CHARSXPs to be shared between multiple character vectors, reducing memory usage.

CHARSXPs are null-terminated. STRSXPs should be null-terminated as well, though I cannot find any explicit information about it. As an intermediate step, I can assume that STRSXPs have a fixed length of 1, which should be sufficient for your use case. Based on that, I can figure out how to support a null-terminated array of pointers.
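To make the idea of a null-terminated array of pointers concrete, here is a minimal sketch in plain ctypes. It assumes the array ends with a NULL pointer as sentinel, which, as noted above, is not confirmed for R's string vectors. The helper name `read_null_terminated_strings` and the sample data are made up for illustration; this is not zugbruecke's implementation.

```python
import ctypes
from ctypes import POINTER, c_char_p


def read_null_terminated_strings(ptr):
    """Collect strings from a char** until a NULL pointer is reached.

    Assumes the array really ends with NULL; without that sentinel
    the loop would read past the end of the array.
    """
    out = []
    i = 0
    while ptr[i] is not None:  # a NULL char* is returned as None
        out.append(ptr[i])
        i += 1
    return out


items = [b"alpha", b"beta", b"gamma"]

# Array of char* with one extra slot holding the NULL sentinel.
arr = (c_char_p * (len(items) + 1))(*items, None)
ptr = ctypes.cast(arr, POINTER(c_char_p))

strings = read_null_terminated_strings(ptr)
print(strings)
```

With a sentinel like this, the number of strings does not need to be passed separately. With a fixed length of 1, the loop is unnecessary and only `ptr[0]` is read.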