microsoft / cppwinrt

C++/WinRT
MIT License
1.64k stars 238 forks source link

C++/WinRT could strip out or mangle C++ reserved words when generating a projection #1284

Closed DHowett closed 1 year ago

DHowett commented 1 year ago

Version

2.0.230225.1

Summary

MIDL3 does not (necessarily) reserve all of the same reserved words as C++ (template, co_await, return, ...). As a result, it is possible to generate a winmd that C++/WinRT cannot project.

Reproducible example

// IDL

namespace Foo {
    [default_interface]
    runtimeclass Bar {
        String template { get; };
    };
}

Expected behavior

I know that lowercase is not traditional in WinRT. However, consuming this IDL via a winmd via C++/WinRT should, roughly, work.

Actual behavior

C++/WinRT generates header files that use the reserved word template, and they cannot be included without introducing a compilation error.

Additional comments

/cc @asklar

DHowett commented 1 year ago

There's a comprehensive list of keywords at cppreference.

I wonder what the best way would be to handle these? Any time the projection renames a member, it risks colliding with another user-provided member. There's different mangling we could perform that reduces the incidence of collision from "likely" to "now you're just trying to mess it up."

For example:

//...
runtimeclass Foo {
    void bar();
    String template { get; };
}
//...

If we mangle template to be Template or template_renamed, somebody might (for some reason) have a third member with the same name and all heck would break loose. In that case, I would propose renaming template to template_index2 or something that encodes its position in the interface. That's less likely to collide with a well-meaning member, but still encodes enough information to figure out what it is.

kennykerr commented 1 year ago

I had to deal with this problem more comprehensively for windows-rs since Rust has a completely different set of keywords, doesn't support overloading, supports Win32 APIs, so collisions are common and windows-rs thus naturally has to come up with unique names for offending identifiers. We could do something similar here, but I'm wondering whether this is a real or theoretical problem.

As in, what is the cost of supporting keyword escaping/mangling in cppwinrt (pretty high) vs updating your WinRT types to follow accepted API guidelines and avoid such keywords?

Another option is to simply perform the mangling prior to calling MIDL, so that you have complete control over the mangling scheme.

asklar commented 1 year ago

IMO the metadata names should not be mangled since some languages might be ok with e.g. "template" (e.g. C#). It should be the job of the projection to make the metadata fit within the restrictions of the language it is projecting to (e.g. keywords).

In the example @DHowett mentioned, we ran into this problem in real life so it's not theoretical :) Agree that it's easier to rename the parameter in metadata, however we didn't catch this for a long time since we were using C#, and only now when we tried using it from C++, we ran into this problem. So the lack of mangling from C++/WinRT adds downstream risk.

sylveon commented 1 year ago

If C++/WinRT goes the way of mangling stuff, it should certainly emit a warning. E.g. warning: template is a reserved C++ keyword and cannot be used. Member Foo.Bar.template has been renamed to Foo.Bar.Template.

kennykerr commented 1 year ago

We had an internal discussion about this and while it would be good if cppwinrt could handle this more gracefully, it would be very costly to implement for C++ which has a very complex pathological case for name collisions, and we can and do avoid this by following API design guidelines to avoid such landmines. This includes guidance to avoid identifiers that conflict with keywords of widely used programming languages. I'm also working on improvements to the metadata generation process so that we can hopefully catch such issues earlier.