Open fitzgen opened 7 years ago
Perhaps foo1, foo2, ... would be better than trying to mangle the template argument type name into the symbol. As the author of a C++ symbol demangling library, I can attest that mangling is more complicated than it seems at first blush.
> Perhaps foo1, foo2, ... would be better than trying to mangle the template argument type name into the symbol
So long as foo2 consistently refers to foo<int> (or whatever), even if the source gets rearranged or foo1/foo<char> gets deleted or replaced.
I suppose it doesn't matter that much if the rearrangement always causes a compile failure on the Rust side, since editing the code should be straightforward if we assume that the C++ FFI code is constrained to a small area which has stable interfaces to everything else in Rustland.
I'm running into a situation where a library declares some template functions in .h without defining them, then defines and instantiates the known monomorphizations in .cpp (one that uses float and another using double). Since other .cpp files in that same library can refer to the header file without needing them to be instantiated, I would assume that Rust-usable bindings could be generated for them.
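The layout described above can be sketched in a single file as follows. This is only an illustration: the name myFun is borrowed from the attempts listed in this thread, while the function body and the demoMyFun helper are hypothetical and not taken from the library.

```cpp
#include <cassert>

// --- would live in the library's .h: declaration only, no body ---
template <typename T>
void myFun(T* arr);

// --- would live in the library's .cpp: the definition ---
// Hypothetical body: doubles the first element in place.
template <typename T>
void myFun(T* arr) {
    arr[0] = arr[0] * 2;
}

// Explicit instantiation definitions: these make the compiler emit symbols
// for exactly these two monomorphizations, so other translation units that
// only saw the declaration can still link against them.
template void myFun<float>(float* arr);
template void myFun<double>(double* arr);

// Small self-check showing that both instantiations are callable.
void demoMyFun() {
    float f[1] = {1.5f};
    double d[1] = {2.5};
    myFun(f);
    myFun(d);
    assert(f[0] == 3.0f);
    assert(d[0] == 5.0);
}
```

Only the float and double symbols exist in the compiled library in this arrangement, which is why bindings would have to target those instantiations specifically rather than the template itself.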
Things I've tried:
1. Forward declaring specific functions in wrapper.hpp like: void myFun(float* arr); void myFun(double* arr); (without template keyword): Generates 2 bindings per myFun with specific function names, linker error running tests - undefined reference to (name of function) - implies that the link_name generated in bindings.rs is wrong.
2. Same thing, but prefixed with template keyword: Fails generating bindings: multiple wrapper.hpp:47:15: error: explicit instantiation of undefined function template 'myFun', err: true
3. Same thing, but now prefixed with extern template - fails to generate any bindings for those functions, and subsequently fails rust compilation due to unresolved imports.
> Forward declaring specific functions in wrapper.hpp like: void myFun(float* arr); void myFun(double* arr); (without template keyword): Generates 2 bindings per myFun with specific function names, linker error running tests - undefined reference to (name of function) - implies that the link_name generated in bindings.rs is wrong.
That doesn't work: a non-template overload is not the same function as a template instantiation, and since those functions aren't defined, the linker errors are expected.
> Same thing, but prefixed with template keyword: Fails generating bindings: multiple wrapper.hpp:47:15: error: explicit instantiation of undefined function template 'myFun', err: true
That's a clang error, which means that it's not valid C++. You should be able to explicitly instantiate some of them, but it seems that by the time you get there myFun is not defined, so you also need to include the file that defines the template.
That would still not work, but is fixable. That's what this issue is about. I'm happy to mentor it.
> Same thing, but now prefixed with extern template - fails to generate any bindings for those functions, and subsequently fails rust compilation due to unresolved imports.
I think that would also work with this issue fixed. It just needs to be fixed.
Thanks, that sounds like the cleanest solution. It looks like (and it seems reasonable that) instantiated function templates just have a different linker name than an identical free function with the same function name and type signature (the reason my option 1 failed).
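To illustrate why the plain forward declarations in option 1 could not link against the library: a non-template function and a template instantiation with the same name and parameter types are distinct entities that can coexist in one program, so they need different symbol names. A minimal sketch with hypothetical bodies follows. The mangled names in the comments are what an Itanium-ABI compiler such as gcc or clang would be expected to produce; they are stated as an assumption, not as output captured from the library.

```cpp
#include <cassert>

// Function template plus an explicit instantiation for float.
// Hypothetical body: writes 1 so the two functions can be told apart.
template <typename T>
void myFun(T* arr) {
    arr[0] = 1;
}
template void myFun<float>(float* arr);  // expected Itanium name: _Z5myFunIfEvPT_

// Non-template overload with the same parameter list, which is what a plain
// forward declaration in wrapper.hpp declares. Writes 2 instead.
void myFun(float* arr) {                 // expected Itanium name: _Z5myFunPf
    arr[0] = 2;
}

// Both functions exist side by side and can be called independently.
void demoDistinct() {
    float a[1] = {0};
    float b[1] = {0};
    myFun(a);         // overload resolution prefers the non-template function
    myFun<float>(b);  // explicit template arguments select the instantiation
    assert(a[0] == 2.0f);
    assert(b[0] == 1.0f);
}
```

A binding generated from the non-template declaration therefore asks the linker for a symbol the library never defined.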
I'm not sure (yet) exactly how bindgen deals with different compilers, but while we could inspect source code for how gcc and clang mangle instantiated template functions (with extern template without definition, or template with definition), is it known how MSVC mangles them?
I suppose in the meantime, including a .cpp file that declares C-bindable functions that use the template functions might even result in the same library code, given inlining, and have no overhead when used from Rust.
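A minimal sketch of that workaround, assuming a template shaped like the myFun mentioned earlier in the thread. The template body and the shim names myFun_float and myFun_double are hypothetical.

```cpp
#include <cassert>

// Stand-in for the library's template (hypothetical body: doubles arr[0]).
template <typename T>
void myFun(T* arr) {
    arr[0] = arr[0] * 2;
}

// C-linkage shims: their symbol names are unmangled and predictable, so they
// can be bound like any ordinary C function. Each call implicitly
// instantiates the template for the corresponding type.
extern "C" void myFun_float(float* arr) { myFun<float>(arr); }
extern "C" void myFun_double(double* arr) { myFun<double>(arr); }

// Small self-check exercising both shims.
void demoShims() {
    float f[1] = {1.5f};
    double d[1] = {2.5};
    myFun_float(f);
    myFun_double(d);
    assert(f[0] == 3.0f);
    assert(d[0] == 5.0);
}
```

Whether the shim call is actually inlined away depends on the compiler and optimization settings, so the "no overhead" part is a hope rather than a guarantee.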
Correct me if I'm wrong, but I think this can't be implemented in bindgen at the moment due to a limitation of libclang.
I believe there is currently no cursor kind for template instantiations in libclang. When looking at the clang AST, you get the following for a function template declaration:
`-FunctionTemplateDecl 0x540ce0 <line:14:1, col:35> col:24 foo
  |-TemplateTypeParmDecl 0x540aa0 <col:10, col:16> col:16 referenced class depth 0 index 0 T
  |-FunctionDecl 0x540c38 <col:19, col:35> col:24 foo 'void (T)'
  | |-ParmVarDecl 0x540b40 <col:28, col:30> col:30 t 'T'
  | `-CompoundStmt 0x540e00 <col:33, col:35>
  |-FunctionDecl 0x55e528 <col:19, col:35> col:24 foo 'void (char)'
  | |-TemplateArgument type 'char'
  | | `-BuiltinType 0x4f8fb0 'char'
  | `-ParmVarDecl 0x55e460 <col:28, col:30> col:30 t 'char':'char'
  `-FunctionDecl 0x55e868 <col:19, col:35> col:24 foo 'void (int)'
    |-TemplateArgument type 'int'
    | `-BuiltinType 0x4f9010 'int'
    `-ParmVarDecl 0x55e7a8 <col:28, col:30> col:30 t 'int':'int'
This includes the explicit instantiations of the foo function as FunctionDecls. However, when dumping the libclang CXCursor, we get something that looks like this:
(FunctionTemplate
  (TemplateTypeParameter)
  (ParmDecl
    (TypeRef)
  )
  (CompoundStmt)
)
The template instantiations are not included in the children, in contrast to the clang AST.
If I'm right about this, we would need to propose changes to libclang before support for explicitly instantiated function templates can be implemented in bindgen.
Perhaps slightly off-topic, but would it theoretically be possible to support explicitly instantiated method templates?
For example, given this:
template <typename T>
class Test {
public:
    int bar(T x);
};
extern template int Test<char>::bar(char x);
extern template int Test<int>::bar(int x);
I think we might be able to generate something like this (although it might require a more sophisticated mangling scheme):
#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct test<T> {
    pub t: T,
    pub _phantom_0: ::std::marker::PhantomData<::std::cell::UnsafeCell<T>>,
}
extern "C" {
    #[link_name = "<test char bar mangled name>"]
    pub fn test_char_bar(
        this: *mut test<u8>,
        x: u8,
    ) -> ::std::os::raw::c_int; // bar returns int for every T
    #[link_name = "<test int bar mangled name>"]
    pub fn test_int_bar(
        this: *mut test<::std::os::raw::c_int>,
        x: ::std::os::raw::c_int,
    ) -> ::std::os::raw::c_int;
}
impl test<u8> {
    pub unsafe fn bar(
        &mut self,
        x: u8,
    ) -> ::std::os::raw::c_int {
        test_char_bar(self, x)
    }
}
impl test<::std::os::raw::c_int> {
    pub unsafe fn bar(
        &mut self,
        x: ::std::os::raw::c_int,
    ) -> ::std::os::raw::c_int {
        test_int_bar(self, x)
    }
}
This would require similar support from libclang. A proposal for this has already been made: https://reviews.llvm.org/D43763. Internally this is considered template specialization in clang, but despite the lack of such specialization in Rust, I don't think there is a problem as long as there are no specializations that change the fields of the class like in https://github.com/rust-lang/rust-bindgen/issues/24.
From the Rust side there's a limitation due to the lack of specialization, but we should be able to support a small subset of it, as you mention. Sadly I'm not that knowledgeable about C++ templates or what would be needed to properly support this in bindgen.
I can't say I'm very knowledgeable about C++ templates myself either. I'm mostly just curious about what's possible.
I do wonder now if the lack of specialization is necessarily a problem for function/method templates. Consider the example from #24 with some methods added:
template <typename T>
class Test {
    int bar(char x);
};
template<>
class Test<int> {
    int foo;
public:
    int bar_a(int x);
};
template<>
class Test<float> {
    float foo;
public:
    int bar_b(float x);
};
bindgen would currently generate a single type that represents the generic one, and the fields from the specializations would not be accessible. However, I don't think that's a problem for the methods, since you could still generate something like this, unless I'm mistaken:
#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct Test<T> {
    pub _address: u8,
    pub _phantom_0: ::std::marker::PhantomData<::std::cell::UnsafeCell<T>>,
}
extern "C" {
    // bindings for methods of all instantiations with mangled link names
}
impl<T> Test<T> {
    fn bar(&mut self, x: u8) -> i32 { /* ... */ }
}
impl Test<i32> {
    fn bar_a(&mut self, x: i32) -> i32 { /* ... */ }
}
impl Test<f32> {
    fn bar_b(&mut self, x: f32) -> i32 { /* ... */ }
}
But I can tell that this only really works in limited cases. If bar, bar_a and bar_b all had the same signature, int bar(int x), there would be duplicate definitions. Maybe it would make sense to use a trait for the bindings, e.g.
pub trait Bar {
    fn bar(&mut self, x: i32) -> i32;
}
impl<T> Bar for Test<T> { /* ... */ }
impl Bar for Test<i32> { /* ... */ }
impl Bar for Test<f32> { /* ... */ }
which would not be valid due to the lack of impl specialization in Rust, and it's probably not something bindgen should ever generate anyway.
I'm sorry if my brainstorming here is a bit off-topic. I do feel like there is indeed a subset that could be supported, which might be interesting, but I can't really define what this subset would be exactly due to how complex C++ templates are and how little I know about them. Perhaps any non-specialized template class, method or function could be supported in theory?
We can't invoke the C++ compiler to generate new instantiations of templates, so I don't think we even try to keep track of them currently.
However, we could track explicit instantiations of template functions (14.7.2 in the standard) and make bindings to those.
For example, if given
We could generate something like:
But either way, we can't create new template instantiations, so we shouldn't even try to generate bindings to generic functions based on the template, or to instantiations for which we haven't seen an explicit instantiation.