hsutter / cppfront

A personal experimental C++ Syntax 2 -> Syntax 1 compiler

feat: evaluate program-defined metafunctions (based on #797) #907

Open JohelEGP opened 6 months ago

JohelEGP commented 6 months ago

feat: evaluate program-defined metafunctions (based on #797)

A metafunction is normal Cpp2 code compiled as part of a library. When parsing a declaration that @-uses the metafunction, the library is loaded and the metafunction invoked on the declaration.

The reflection API is available by default to Cpp2 code (via cpp2util.h). The implementation of the API is provided by the cppfront executable. For this to work, compiling cppfront should export its symbols (for an explanation, see https://cmake.org/cmake/help/latest/prop_tgt/ENABLE_EXPORTS.html).

For cppfront to emit program-defined metafunctions, the environment variable CPPFRONT_METAFUNCTION_LIBRARY should be set to the library's path.

For cppfront to load program-defined metafunctions, the environment variable CPPFRONT_METAFUNCTION_LIBRARIES should be set to the :-separated library paths of the used metafunctions.
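As a rough illustration of how a `:`-separated value such as CPPFRONT_METAFUNCTION_LIBRARIES could be turned into individual library paths, here is a minimal C++ sketch. The function names `split_library_paths` and `libraries_from_environment` are hypothetical and not taken from this PR; only the environment variable name comes from the description above.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <string_view>
#include <vector>

// Split a ':'-separated list into its non-empty parts.
// (Hypothetical helper; the PR's actual parsing code may differ.)
std::vector<std::string> split_library_paths(std::string_view value) {
    std::vector<std::string> paths;
    while (!value.empty()) {
        auto pos  = value.find(':');
        auto part = value.substr(0, pos);      // npos means "rest of the string"
        if (!part.empty()) {
            paths.emplace_back(part);
        }
        if (pos == std::string_view::npos) {
            break;
        }
        value.remove_prefix(pos + 1);
    }
    return paths;
}

// Read the variable named in the description above; empty list if unset.
std::vector<std::string> libraries_from_environment() {
    const char* value = std::getenv("CPPFRONT_METAFUNCTION_LIBRARIES");
    return value ? split_library_paths(value) : std::vector<std::string>{};
}
```

One caveat with this scheme: `:` also appears in Windows drive-letter paths, so a portable implementation would probably need a platform-specific separator.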

Here is an example of program-defined metafunctions. The commands were cleaned up from the CMake buildsystem in #797.

metafunctions.cpp2:

greeter: (inout t: cpp2::meta::type_declaration) = {
  t.add_member($R"(say_hi: () = std::cout << "Hello, world!\nFrom (t.name())$\n";)");
}

main.cpp2:

my_class: @greeter type = { }
main: ()                = my_class().say_hi();

Build cppfront:

g++ -std=c++20 cppfront.cpp -o cppfront
    # Note: check that we don't need to specify these flags explicitly, if they're defaults
    # g++ -std=c++20 -o cppfront.cpp.o -c cppfront.cpp
    # g++ -Wl,--export-dynamic -rdynamic cppfront.cpp.o -o cppfront

Build metafunctions:

CPPFRONT_METAFUNCTION_LIBRARY=libmetafunctions.so ./cppfront metafunctions.cpp2
g++ -std=c++20 -fPIC -o metafunctions.cpp.o -c metafunctions.cpp
g++ -fPIC -shared -Wl,-soname,libmetafunctions.so -o libmetafunctions.so metafunctions.cpp.o

Build and run main:

CPPFRONT_METAFUNCTION_LIBRARIES=libmetafunctions.so ./cppfront main.cpp2
g++ -std=c++20 main.cpp -o main
./main

Output:

metafunctions.cpp2... ok (all Cpp2, passes safety checks)

main.cpp2... ok (all Cpp2, passes safety checks)

Hello, world!
From my_class
JohelEGP commented 6 months ago

The design paper will come later.

JohelEGP commented 6 months ago

I opened #909 for the design write-up.

JohelEGP commented 6 months ago

Now reflect.h2 doesn't have a dependency on parse.h. It is compiled as a pure Cpp2 header, taking advantage of https://github.com/hsutter/cppfront/issues/594#issuecomment-1793627053. reflect_impl.h2 has the remaining bits that depend on parse.h.

To regenerate reflect.h2, use

cppfront -p reflect.h2 -o cpp2reflect.h
mv cpp2reflect.h ../include/

I use std::any values to build the compilation firewall. The implementation, cpp2reflect.hpp, is #included at the end of reflect_impl.h2. This is why odr-uses of the reflection API require linking to cppfront.
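To make the std::any firewall idea concrete, here is a small C++ sketch of the general technique, written for illustration only. The names `services` and `services_data` are hypothetical stand-ins; the real types live in reflect.h2 and reflect_impl.h2 and differ in detail.

```cpp
#include <any>
#include <cassert>
#include <string>
#include <utility>

// "Public header" part: mentions the data type by name only,
// so it does not need the parser headers that define it.
struct services_data;            // hypothetical implementation-side data

class services {
    std::any data_;              // type-erased implementation state
    services_data& data();       // defined where services_data is complete
public:
    explicit services(std::any d) : data_(std::move(d)) {}
    void        set_metafunction_name(std::string const& name);
    std::string get_metafunction_name();
};

// "Implementation" part: only this side sees the full definition.
struct services_data {
    std::string metafunction_name;
};

services_data& services::data() {
    // The pointer overload of any_cast returns nullptr on a type mismatch.
    auto* p = std::any_cast<services_data>(&data_);
    assert(p && "data_ must store a services_data");
    return *p;
}

void services::set_metafunction_name(std::string const& name) {
    data().metafunction_name = name;
}

std::string services::get_metafunction_name() {
    return data().metafunction_name;
}
```

Because the member functions are defined on the implementation side, any code that odr-uses them has to link against whatever provides those definitions, which mirrors the linking requirement described above.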

DyXel commented 6 months ago

If depending on Boost.DLL is undesirable, loading and using shared libraries is actually quite simple:

For POSIX:

For Windows:

It doesn't get much more complicated than that if you just want to call C-named functions. Let me know if you'd like me to support on this.
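A minimal sketch of what such loading code might look like, assuming only the standard OS calls (`dlopen`/`dlsym` on POSIX, `LoadLibraryA`/`GetProcAddress` on Windows). The helper name `load_c_symbol` is hypothetical; this is not the code from the comment above or from the PR.

```cpp
#include <cassert>

#if defined(_WIN32)
  #include <windows.h>
#else
  #include <dlfcn.h>
#endif

// Load a shared library and look up one C-named symbol in it.
// Returns nullptr if the library or the symbol cannot be found.
// The library handle is intentionally kept open (never closed) so that
// the returned pointer stays valid for the lifetime of the process.
void* load_c_symbol(const char* library_path, const char* symbol_name) {
#if defined(_WIN32)
    HMODULE lib = LoadLibraryA(library_path);
    if (!lib) {
        return nullptr;
    }
    return reinterpret_cast<void*>(GetProcAddress(lib, symbol_name));
#else
    void* lib = dlopen(library_path, RTLD_NOW | RTLD_LOCAL);
    if (!lib) {
        return nullptr;
    }
    return dlsym(lib, symbol_name);
#endif
}
```

On older glibc versions the POSIX branch may additionally need `-ldl` at link time. The caller still has to cast the returned pointer to the right function type, which is why this approach is most comfortable with `extern "C"` functions.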

JohelEGP commented 6 months ago

Let me know if you'd like me to support on this.

Thank you. I think we would all like this.

It would help to not have to depend on Boost.DLL if there is an implementation for the current platform.

For GCC, I can use my system's Boost.DLL. But for Clang, I have to build its dependencies from source due to ABI (https://github.com/hsutter/cppfront/discussions/797#discussioncomment-7492847).

DyXel commented 6 months ago

Alright, I'll write the changes and open a PR against the branch in your repo. We can discuss further over there once it's up 👍🏻

JohelEGP commented 6 months ago

The diff in QtCreator better shows how reflect.h2 changed into cpp2reflect.h2:

```diff
--- source/reflect.h2
+++ source/cpp2reflect.h2
@@ -15,7 +15,6 @@
 // Reflection and meta
 //===========================================================================
 
-#include "parse.h"
 
 cpp2: namespace = {
 
@@ -33,58 +32,51 @@ compiler_services: @polymorphic_base @copyable type =
 {
     // Common data members
     //
-    errors : *std::vector;
-    errors_original_size : int;
-    generated_tokens : *std::deque;
-    parser : cpp2::parser;
-    metafunction_name : std::string = ();
-    metafunction_args : std::vector = ();
-    metafunctions_used : bool = false;
+    data_: std::any;
+    private data: (this) -> forward _ = std::any_cast>(data_);
+    private data: (inout this) -> forward _ = std::any_cast>(data_);
 
     // Constructor
     //
     operator=: (
         out this,
-        errors_ : *std::vector,
-        generated_tokens_: *std::deque
+        data_v: std::any
     )
     = {
-        errors = errors_;
-        errors_original_size = cpp2::unsafe_narrow(std::ssize(errors*));
-        generated_tokens = generated_tokens_;
-        parser = errors*;
+        data_ = data_v;
+        assert( data_.type() == Typeid(), "parameter 'data_v' must store a 'compiler_services_data'" );
     }
 
     // Common API
     //
     set_metafunction_name: (inout this, name: std::string_view, args: std::vector) = {
-        metafunction_name = name;
-        metafunction_args = args;
-        metafunctions_used = args.empty();
+        data().metafunction_name = name;
+        data().metafunction_args = args;
+        data().metafunctions_used = args.empty();
     }
 
-    get_metafunction_name: (this) -> std::string_view = metafunction_name;
+    get_metafunction_name: (this) -> std::string_view = data().metafunction_name;
 
     get_argument: (inout this, index: int) -> std::string = {
-        metafunctions_used = true;
-        if (0 <= index < metafunction_args.ssize()) {
-            return metafunction_args[index];
+        data().metafunctions_used = true;
+        if (0 <= index < data().metafunction_args.ssize()) {
+            return data().metafunction_args[index];
         }
         return "";
     }
 
     get_arguments: (inout this) -> std::vector = {
-        metafunctions_used = true;
-        return metafunction_args;
+        data().metafunctions_used = true;
+        return data().metafunction_args;
     }
 
-    arguments_were_used: (this) -> bool = metafunctions_used;
+    arguments_were_used: (this) -> bool = data().metafunctions_used;
 
     protected parse_statement: (
         inout this,
         copy source: std::string_view
     )
-        -> (ret: std::unique_ptr)
+        -> _
     = {
         original_source := source;
 
@@ -116,7 +108,7 @@ compiler_services: @polymorphic_base @copyable type =
         // Now lex this source fragment to generate
         // a single grammar_map entry, whose .second
         // is the vector of tokens
-        _ = generated_lexers.emplace_back( errors* );
+        _ = generated_lexers.emplace_back( data().errors* );
         tokens := generated_lexers.back()&;
         tokens*.lex( lines*, true );
 
@@ -124,20 +116,18 @@ compiler_services: @polymorphic_base @copyable type =
 
         // Now parse this single declaration from
         // the lexed tokens
-        ret = parser.parse_one_declaration(
+        ret := data().parser.parse_one_declaration(
             tokens*.get_map().begin()*.second,
-            generated_tokens*
+            data().generated_tokens*
         );
         if !ret {
             error( "parse failed - the source string is not a valid statement:\n(original_source)$");
         }
+        return ret;
     }
 
-    position: (virtual this)
-        -> source_position
-    = {
-        return ();
-    }
+    protected position: (this) std::any_cast(vposition());
+    protected vposition: (virtual this) -> std::any = source_position();
 
     // Error diagnosis and handling, integrated with compiler output
     // Unlike a contract violation, .requires continues further processing
@@ -156,10 +146,10 @@ compiler_services: @polymorphic_base @copyable type =
     error: (this, msg: std::string_view)
     = {
         message := msg as std::string;
-        if !metafunction_name.empty() {
-            message = "while applying @(metafunction_name)$ - (message)$";
+        if !data().metafunction_name.empty() {
+            message = "while applying @(data().metafunction_name)$ - (message)$";
         }
-        _ = errors*.emplace_back( position(), message);
+        _ = data().errors*.emplace_back( position(), message);
     }
 
     // Enable custom contracts on this object, integrated with compiler output
@@ -167,7 +157,7 @@ compiler_services: @polymorphic_base @copyable type =
     //
     report_violation: (this, msg) = {
         error(msg);
-        throw( std::runtime_error(" ==> programming bug found in metafunction @(metafunction_name)$ - contract violation - see previous errors") );
+        throw( std::runtime_error(" ==> programming bug found in metafunction @(data().metafunction_name)$ - contract violation - see previous errors") );
     }
 
     has_handler:(this) true;
@@ -206,7 +196,7 @@ type_id: @polymorphic_base @copyable type =
     template_args_count : (this) -> int = n*.template_arguments().ssize();
     to_string : (this) -> std::string = n*.to_string();
 
-    position: (override this) -> source_position = n*.position();
+    protected vposition: (override this) -> std::any = n*.position();
 }
 */
 
@@ -224,20 +214,36 @@ declaration_base: @polymorphic_base @copyable type =
 {
     this: compiler_services = ();
 
-    protected n: *declaration_node;
+    node_pointer: @copyable type =
+    {
+        n: std::any;
+
+        operator=: (
+            implicit out this,
+            n_: T
+        )
+        = {
+            n = n_;
+            assert( n_, "a meta::declaration must point to a valid declaration_node, not null" );
+            static_assert( std::is_same_v );
+        }
+
+        operator*: (this) -> forward _ = std::any_cast<*declaration_node>(n)*;
+    }
+
+    protected n: node_pointer;
 
     protected operator=: (
         out this,
-        n_: *declaration_node,
+        n_: node_pointer,
         s : compiler_services
     )
     = {
         compiler_services = s;
         n = n_;
-        assert( n, "a meta::declaration must point to a valid declaration_node, not null" );
     }
 
-    position: (override this) -> source_position = n*.position();
+    protected vposition: (override this) -> std::any = n*.position();
 
     print: (this) -> std::string = n*.pretty_print_visualize(0);
 }
@@ -252,7 +258,7 @@ declaration: @polymorphic_base @copyable type =
 
     operator=: (
         out this,
-        n_: *declaration_node,
+        n_: declaration_base::node_pointer,
         s : compiler_services
     )
     = {
@@ -334,7 +340,7 @@ function_declaration: @copyable type =
 
     operator=: (
         out this,
-        n_: *declaration_node,
+        n_: declaration_base::node_pointer,
         s : compiler_services
     )
     = {
@@ -349,7 +355,7 @@ function_declaration: @copyable type =
     has_move_parameter_named : (this, s: std::string_view) -> bool = n*.has_move_parameter_named(s);
     first_parameter_name : (this) -> std::string = n*.first_parameter_name();
 
-    has_parameter_with_name_and_pass: (this, s: std::string_view, pass: passing_style) -> bool
+    has_parameter_with_name_and_pass: (this, s: std::string_view, pass) -> bool
         = n*.has_parameter_with_name_and_pass(s, pass);
     is_function_with_this : (this) -> bool = n*.is_function_with_this();
     is_virtual : (this) -> bool = n*.is_virtual_function();
@@ -421,7 +427,7 @@ object_declaration: @copyable type =
 
     operator=: (
         out this,
-        n_: *declaration_node,
+        n_: declaration_base::node_pointer,
         s : compiler_services
     )
     = {
@@ -457,7 +463,7 @@ type_declaration: @copyable type =
 
     operator=: (
         out this,
-        n_: *declaration_node,
+        n_: declaration_base::node_pointer,
         s : compiler_services
     )
     = {
@@ -592,7 +598,7 @@ alias_declaration: @copyable type =
 
     operator=: (
         out this,
-        n_: *declaration_node,
+        n_: declaration_base::node_pointer,
         s : compiler_services
     )
     = {
@@ -1338,110 +1344,6 @@ print: (t: meta::type_declaration) =
 }
 
 
-//-----------------------------------------------------------------------
-//
-// apply_metafunctions
-//
-apply_metafunctions: (
-    inout n : declaration_node,
-    inout rtype : type_declaration,
-    error
-    )
-    -> bool
-= {
-    assert( n.is_type() );
-
-    // Check for _names reserved for the metafunction implementation
-    for rtype.get_members()
-    do (m)
-    {
-        m.require( !m.name().starts_with("_") || m.name().ssize() > 1,
-                   "a type that applies a metafunction cannot have a body that declares a name that starts with '_' - those names are reserved for the metafunction implementation");
-    }
-
-    // For each metafunction, apply it
-    for n.metafunctions
-    do (meta)
-    {
-        // Convert the name and any template arguments to strings
-        // and record that in rtype
-        name := meta*.to_string();
-        name = name.substr(0, name.find('<'));
-
-        args: std::vector = ();
-        for meta*.template_arguments()
-        do (arg)
-            args.push_back( arg.to_string() );
-
-        rtype.set_metafunction_name( name, args );
-
-        // Dispatch
-        //
-        if name == "interface" {
-            interface( rtype );
-        }
-        else if name == "polymorphic_base" {
-            polymorphic_base( rtype );
-        }
-        else if name == "ordered" {
-            ordered( rtype );
-        }
-        else if name == "weakly_ordered" {
-            weakly_ordered( rtype );
-        }
-        else if name == "partially_ordered" {
-            partially_ordered( rtype );
-        }
-        else if name == "copyable" {
-            copyable( rtype );
-        }
-        else if name == "basic_value" {
-            basic_value( rtype );
-        }
-        else if name == "value" {
-            value( rtype );
-        }
-        else if name == "weakly_ordered_value" {
-            weakly_ordered_value( rtype );
-        }
-        else if name == "partially_ordered_value" {
-            partially_ordered_value( rtype );
-        }
-        else if name == "struct" {
-            cpp2_struct( rtype );
-        }
-        else if name == "enum" {
-            cpp2_enum( rtype );
-        }
-        else if name == "flag_enum" {
-            flag_enum( rtype );
-        }
-        else if name == "union" {
-            cpp2_union( rtype );
-        }
-        else if name == "print" {
-            print( rtype );
-        }
-        else {
-            error( "unrecognized metafunction name: " + name );
-            error( "(temporary alpha limitation) currently the supported names are: interface, polymorphic_base, ordered, weakly_ordered, partially_ordered, copyable, basic_value, value, weakly_ordered_value, partially_ordered_value, struct, enum, flag_enum, union, print" );
-            return false;
-        }
-
-        if (
-            !args.empty()
-            && !rtype.arguments_were_used()
-            )
-        {
-            error( name + " did not use its template arguments - did you mean to write '" + name + " <" + args[0] + "> type' (with the spaces)?");
-            return false;
-        }
-    }
-
-    return true;
-}
-
-
 
 }
 }
```
JohelEGP commented 6 months ago

Thanks to the contribution in https://github.com/JohelEGP/cppfront/pull/1 by @DyXel and @edo9300, we now use the OS APIs directly for loading libraries. This means that Boost.DLL is no longer required, and a cppfront compiled as usual will have support for loading metafunctions. You still need to specify the libraries with metafunctions via an environment variable. I have updated the opening comment, which should be used as the commit message when merging, to reflect this.

JohelEGP commented 6 months ago

There's an issue with regards to symbol visibility.

On Windows, symbols are not exported by default. For Visual Studio, we need to use one of four methods to export symbols (https://learn.microsoft.com/en-us/cpp/build/reference/export-exports-a-function?view=msvc-170).

In Cpp1, exporting a symbol usually entails decorating its declaration with a portable macro (here CPPFRONTAPI). Unfortunately, there's no way to specify that in the Cpp2 source reflect.h2. The other three methods for VS happen outside the source code, which means manually listing the declarations.

Unlike GCC and Clang, it seems like VS doesn't offer an option to export all symbols. So we need Cpp2 support to export library symbols, or to use one of the other three VS-specific methods, or, equivalently, a patch that applies the export macro to the generated declarations.

NB: It is compiling cppfront which should export the symbols as it compiles the definitions in cpp2reflect.hpp.
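For readers unfamiliar with the pattern, a portable export macro of the kind mentioned above is commonly written along these lines. This is a generic sketch, not the definition used in the PR: the macro name `CPPFRONTAPI` comes from the comment above, while its expansion here and the `demo::answer` function are assumptions for illustration.

```cpp
#include <cassert>

// Typical shape of an export macro (assumed expansion, for illustration).
#if defined(_WIN32)
  // Windows: symbols are hidden unless explicitly exported.
  #define CPPFRONTAPI __declspec(dllexport)
#else
  // GCC/Clang: keeps the symbol visible even under -fvisibility=hidden.
  #define CPPFRONTAPI __attribute__((visibility("default")))
#endif

namespace demo {

// In Cpp1 the macro decorates the declaration directly; the problem
// described above is that Cpp2 source has no place to write it.
CPPFRONTAPI int answer() { return 42; }

}  // namespace demo
```

A complete version would usually also switch to `__declspec(dllimport)` for consumers of the library on Windows; that part is omitted here to keep the sketch short.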

JohelEGP commented 6 months ago

So we need Cpp2 support to export library symbols,

I have another use for such a feature.

In my case, I'm using an implementation of @rule_of_zero (https://github.com/hsutter/cppfront/pull/808). Then I'm defining @sfml, which uses it (https://github.com/hsutter/cppfront/discussions/789#discussioncomment-7572313).

When compiling with the hidden visibility preset, loading the metafunction sfml fails. It works when compiling the module that declares sfml because it imports the module that exports rule_of_zero. But on load, the rule_of_zero in its body is diagnosed as an undefined symbol, as it isn't visible. Manually adding CPPFRONTAPI to the lowered declaration of rule_of_zero makes it work.

JohelEGP commented 6 months ago

I'm thinking of just adding a metafunction to lower a declaration with the CPPFRONTAPI macro.

There are similarities to export declarations of C++ modules (which will be an access-specifier in Cpp2, see https://github.com/hsutter/cppfront/issues/269#issuecomment-1464790572). I asked on the Cpplang Slack on how it relates to symbol visibility, but they are orthogonal features (see https://cpplang.slack.com/archives/C92GZLCSE/p1700658714626929).

Relatedly, we might eventually want to lower extern "C" declarations.

#624 is also interested in using @c_api for something else.

Anyways, I'll try to think of some fitting name to get things moving here. Suggestions are welcome!

DyXel commented 6 months ago

When compiling with the hidden visibility preset, loading the metafunction sfml fails. It works when compiling the module that declares sfml because it imports the module that exports rule_of_zero. But on load, the rule_of_zero in its body is diagnosed as an undefined symbol, as it isn't visible. Manually adding CPPFRONTAPI to the lowered declaration of rule_of_zero makes it work.

I guess technically you could use the C declaration for this case, but you'd need to do the casting to void*, plus namespace handling would be out of the window. Or, since you can detect the signature of a metafunction already, metafunctions detected within a body of another could be lowered to the specific C-magic. Both very ugly hacks but could work for a POC.

Indeed there needs to be a way to mark stuff to use C-linkage and/or declare its symbol visibility (in general, a way of spelling this specific stuff, I am sure there are more out there), but I wonder, should that functionality be covered as you mention with a metafunction (c_api)? I didn't think of using them like that, feels like that is not what metafunctions are intended for, and you are still modifying the signature of a function within cpp2 limits, no? It should error out saying it's not valid code.

Edit: Saw the commit, of course adding a flag to modify the lowering behavior, that works! But I still am questioning whether metafunctions are the right tool for this.

JohelEGP commented 6 months ago

With commit 7107644ead0d837307d1cd5183119258e11c1938, by declaring rule_of_zero with @visible when using the hidden visibility preset, sfml loads successfully and everything works as when using the default visibility preset.

JohelEGP commented 6 months ago

Now I just need to overload it for types, use it in reflect.h2, and then VS will work (https://github.com/hsutter/cppfront/pull/907#issuecomment-1870757843).

JohelEGP commented 6 months ago

I guess technically you could use the C declaration for this case, but you'd need to do the casting to void*, plus namespace handling would be out of the window. Or, since you can detect the signature of a metafunction already, metafunctions detected within a body of another could be lowered to the specific C-magic. Both very ugly hacks but could work for a POC.

The undefined symbol being that of a metafunction is a coincidence. The general issue is that Cpp2, without @visible, can't be used to author a DLL (with proper hygiene, i.e., hidden visibility by default, like the Windows default). A metafunction couldn't use a name declared in another Cpp2 TU.

JohelEGP commented 6 months ago

Now I can use a cppfront compiled with the hidden visibility preset. With it, my uses of metafunctions in my project work just fine. So the Visual Studio issue should be solved now (https://github.com/hsutter/cppfront/pull/907#issuecomment-1870757843).

JohelEGP commented 6 months ago

The windows CI

JohelEGP commented 6 months ago

I have a plan to solve the name lookup problem for good using only the current source file.

The only chance for surprise is when a user expects a non-local name to be found. In the rare case we have a local match, the generated code won't be what the user expects. That error shouldn't get past the static_assert. But it might just immediately break evaluating the next metafunction in a chain.

JohelEGP commented 6 months ago

I need the metafunction symbols exported by the libraries. I will fall back to using Boost.DLL to prototype the solution. It would be painful to use the C interfaces: https://stackoverflow.com/a/2694373.

Or maybe I should just go ahead and start emitting the extra semantic information (https://github.com/hsutter/cppfront/issues/909#issuecomment-1871286126).

DyXel commented 6 months ago

Or maybe I should just go ahead and start emitting the extra semantic information (#909 (comment)).

I think having a specific C function per TO/DLL that is able to tell whether or not it has the symbol, has value on its own, for example:

CPP2_C_API int cpp2_meta_library_has_metafunction(const char* name, size_t size) {
    static std::set<std::string_view> mfs = {"greeter", /*...*/};
    return mfs.count(std::string_view{name, size});
}

(...or alternatively, a function that gives you a list of strings from which you can build a look-up table)

Armed with a function like this you would be able to tell what was exported, but also, you could first check the existence of this same function before proceeding with anything else, granting the opportunity to give the user a good explanatory message, like "cpp2_meta_library_has_metafunction was not found in DLL 'x', are you sure 'x' is a cppfront meta library?".
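The alternative mentioned above (a function that returns a list of names from which the loader builds a look-up table) might be sketched as follows. Everything here is hypothetical: the symbol name `cpp2_meta_library_metafunction_names`, the null-terminated array convention, and the helper `build_lookup_table` are assumptions made for illustration, not part of the PR.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <string_view>

extern "C" {

// Function type of the per-library entry point (hypothetical).
using names_fn = const char* const* (*)();

// Library side: a null-terminated array of the exported metafunction names.
const char* const* cpp2_meta_library_metafunction_names() {
    static const char* const names[] = {"greeter", nullptr};
    return names;
}

}  // extern "C"

// Loader side: 'fn' would normally come from dlsym/GetProcAddress and is
// nullptr when the library does not provide the entry point at all.
std::set<std::string> build_lookup_table(names_fn fn,
                                         std::string_view library,
                                         std::string& error) {
    std::set<std::string> table;
    if (!fn) {
        error = "'cpp2_meta_library_metafunction_names' was not found in '"
              + std::string(library)
              + "', are you sure it is a cppfront meta library?";
        return table;
    }
    for (const char* const* p = fn(); *p; ++p) {
        table.insert(*p);
    }
    return table;
}
```

Checking for the entry point first is what allows the friendlier diagnostic suggested above, before any individual metafunction is looked up.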

Just my 2 cents though.

JohelEGP commented 6 months ago

Great idea, thank you!

JohelEGP commented 6 months ago

but I wonder, should that functionality be covered as you mention with a metafunction (c_api)? I didn't think of using them like that, feels like that is not what metafunctions are intended for

I don't disagree. But it's currently the most fitting place to specify properties for the declaration. I also want a @deleted instead of having to unsatisfactorily abuse a private overload (see the thread starting at https://github.com/hsutter/cppfront/issues/468#issuecomment-1627565647 and the referencing issues).

And potentially @all_freestanding, @freestanding, @freestanding_deleted, and @hosted (whose effect depends on __STDC_HOSTED__). Although those have usability limitations (it might not really be what we want):

More generally, we still need a replacement for some uses of the preprocessor. Maybe Cpp1 reflection will help here.

DyXel commented 6 months ago

I don't disagree. But it's currently the most fitting place to specify properties for the declaration.

Yeah, for a POC is fine. I do want people trying this functionality to its maximum and give good feedback to Herb, but I do worry about its future, been thinking about this for a while now so I might as well share my opinion:

A user-defined metafunction right now can do way too much, after all, it is arbitrary Cpp1 code being compiled and executed. It was fine being only used internally by the cppfront compiler, but if we give users total freedom to do whatever they want within a metafunction (e.g. execute arbitrary code and generate side-effects) then it'll become a problem once cpp1 reflection/generation lands, assuming it would be a subset of what cppfront offers, there would be competition between what cppfront can do and what was standardized ("teach a man how to fish...").

Another thing that I don't like is the necessary double-pass introduced by this user-defined meta functionality, in order to use a user-defined metafunction you'd need to write Cpp2, which gets lowered to Cpp1, which then gets compiled, and then used somewhere else in Cpp2, this extends the usage from simple transpiler (cpp2 -> cpp1 -> compile and run!) to something more complicated (cpp2 -> cpp1 -> compile meta -> cpp2 -> cpp1 -> compile and run!), which might drive adoption away.

For the latter, I think in a perfect world, cppfront would be able to interpret and apply the metafunction itself without having to loopback, and then when cpp1 meta lands (hopefully a good implementation with feedback received from this experiment!), we start generating cpp1 code instead of interpreting, and so users would be minimally affected.

JohelEGP commented 6 months ago

then it'll become a problem once cpp1 reflection/generation lands, assuming it would be a subset of what cppfront offers, there would be competition between what cppfront can do and what was standardized

For C++26 (P2996), the feature sets would be disjoint. In the future, we should have metafunctions (meta classes?) in Cpp1. That should subsume Cpp2 metafunctions, and we should migrate to using that Cpp1 feature.

JohelEGP commented 6 months ago

Or maybe I should just go ahead and start emitting the extra semantic information (#909 (comment)).

I think having a specific C function per TO/DLL that is able to tell whether or not it has the symbol, has value on its own, for example:

CPP2_C_API int cpp2_meta_library_has_metafunction(const char* name, size_t size)

Now I need this function to have a unique name per Cpp2 source with metafunctions.

A library can be composed of multiple source files. So I can't use the library path for the unique name. When loading the library, I can't map back to its source files.

I could use the absolute path of the Cppx source file. But without the abstraction to loop over a DLL's symbols, on top of needing the library path when loading a metafunction, I would also need the Cppx source files that make it up. The source of this information would be the caller of cppfront.

Emitting extra semantic information would be cleaner and forward-looking (https://github.com/hsutter/cppfront/issues/909#issuecomment-1871286126). Of course, that still requires the caller of cppfront to give us a file and to forward that file to the compilation of dependent Cpp2 source files.
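To illustrate the "unique name per source" idea discussed above, one naive option would be to derive the symbol name from the source path. The sketch below is only an assumption about how that could look: the helper `symbol_name_for_source` is hypothetical, and the prefix is borrowed from the `cpp2_metafunction_get_symbol_names_` name that appears later in this thread.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <string_view>

// Build a C identifier from a source path by replacing every character
// that is not alphanumeric with '_'. (Hypothetical scheme.)
std::string symbol_name_for_source(std::string_view source_path) {
    std::string name = "cpp2_metafunction_get_symbol_names_";
    for (unsigned char c : source_path) {
        name += std::isalnum(c) ? static_cast<char>(c) : '_';
    }
    return name;
}
```

This simple mapping is not collision-free (for example, `a/b.cpp2` and `a_b.cpp2` map to the same name), and, as noted above, the loader would still need to be told the source paths in order to reconstruct the names.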

JohelEGP commented 6 months ago

For this sanity check to be viable in a non-modules world, metafunctions need to be authored in a pure Cpp2 .h2 header. The implementing .hpp header needs to be #included in a library's source file. A metafunction's @-use requires its .h2 header to be #included and its implementing library to be linked to. See https://github.com/hsutter/cppfront/issues/594#issuecomment-1793627053 for details on this .h2 header usage. I would also need to emit the loadable symbol in Phase 2 "Cpp2 type definitions and function declarations".

JohelEGP commented 6 months ago

With regards to https://github.com/hsutter/cppfront/pull/907#issuecomment-1872644205. There's another way to support multi-source libraries/C++ modules with TMP. The function that returns the list of symbols would be named the same on all sources but be a template dependent on a compile-time counter. The source compiled with CPPFRONT_METAFUNCTION_LIBRARY (the Cpp2 source of the library which includes the implementing .hpp headers, and the module interface unit for a C++ module) would have the uniquely-named symbol that returns the aggregated list.

JohelEGP commented 6 months ago

@hsutter This is ready for review.

JohelEGP commented 5 months ago

This is what might be able to remove the need for CPPFRONT_METAFUNCTION_LIBRARY (https://github.com/hsutter/cppfront/issues/909#issuecomment-1885005866):

IIUC, that inverts the logic so that plugins register themselves, right?

Yes, mostly. The application still needs to know that libraries to load but this process is just reduced to system calls to load the library and find one "C" function with a known name.

DyXel commented 5 months ago

This is what might be able to remove the need for CPPFRONT_METAFUNCTION_LIBRARY (#909 (comment)):

IIUC, that inverts the logic so that plugins register themselves, right?

Yes, mostly. The application still needs to know that libraries to load but this process is just reduced to system calls to load the library and find one "C" function with a known name.

https://github.com/hsutter/cppfront/pull/907#issuecomment-1871737914 I briefly mentioned something similar here.

Thinking about it, what you'd need is a static object for which you can register the metafunctions automatically when the DLL is loaded, then you can have a per-DLL function with that unique name that gives you back the mapping between a name and the actual metafunction. It would also be a good place (needed even?) for a "teardown", as we discussed.

DyXel commented 5 months ago

Regarding the above compiler flags:

DyXel commented 5 months ago

By the way, as a side-note: Could we please change the envvar names a little? I got them mixed up during our call (sorry for that), though the conclusion was correct, I think they are too similar:

CPPFRONT_METAFUNCTION_LIBRARY
CPPFRONT_METAFUNCTION_LIBRARIES

But I don't know what to suggest, maybe CPPFRONT_META_LIB_NAME and CPPFRONT_META_LOAD_LIBS?

MaxSagebaum commented 5 months ago

I also checked the workflow on my machine.

There are two options to address this issue:

  1. cpp2_metafunction_get_symbol_names_ is no longer generated automatically. The user has to add it manually. This might be addressed with a metafunction @create_meta_export(func1, func2, func3,...). But this would require having "free flow" metafunctions. The code would be:
    greeter: (inout t: cpp2::meta::type_declaration) = {
      t.add_member($R"(say_hi: () = std::cout << "Hello, world!\nFrom (t.name())$\n";)");
    }
    @create_meta_export(greeter)
  2. We use the static initialization to register the metafunction automagically. cppfront could export a symbol e.g. add_metafunction and then for each metafunction a static registration method is created, e.g.
    static const int init_greeter = ::cpp2::meta::add_metafunction(greeter, "greeter");

I think I would prefer option 1.
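For comparison, option 2 (registration through static initialization) could look roughly like this in lowered C++. This is a self-contained sketch under assumptions: the `meta_sketch` namespace, the simplified `type_declaration`, and the registry are hypothetical stand-ins, and in the real design `add_metafunction` would be a symbol exported by cppfront rather than defined next to the metafunction.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

namespace meta_sketch {

// Simplified stand-in for cpp2::meta::type_declaration.
struct type_declaration {
    std::string              name;
    std::vector<std::string> members;
};

using metafunction = void (*)(type_declaration&);

// Function-local static avoids static-initialization-order problems:
// the map is created on first use, even during dynamic initialization.
std::map<std::string, metafunction>& registry() {
    static std::map<std::string, metafunction> r;
    return r;
}

// Returns a dummy int so it can initialize a namespace-scope variable.
int add_metafunction(metafunction f, const char* name) {
    registry()[name] = f;
    return 0;
}

}  // namespace meta_sketch

// A metafunction as it might look after lowering (body simplified).
void greeter(meta_sketch::type_declaration& t) {
    t.members.push_back("say_hi");
}

// Emitted once per metafunction: runs when the library is loaded.
[[maybe_unused]] static const int init_greeter =
    meta_sketch::add_metafunction(greeter, "greeter");
```

With this shape the loader only has to load the library; the registrations happen as a side effect of loading, so no per-library list of names needs to be generated.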

Here is a patch that removes CPPFRONT_METAFUNCTION_LIBRARY: 0001-Remove-CPPFRONT_METAFUNCTION_LIBRARY-requirement.txt

On my machine the reduced layout creating and using a metafunction is now:

# Compiling cppfront
  g++ -rdynamic cppfront.cpp -o cppfront

# Creating the library
  ./cppfront metafunctions.cpp2
  g++ -std=c++20 -fPIC -shared -o libmetafunctions.so metafunctions.cpp

# Using the library
  CPPFRONT_METAFUNCTION_LIBRARIES=./libmetafunctions.so ./cppfront main.cpp2
  g++ -std=c++20 main.cpp -o main
  ./main
DyXel commented 5 months ago

I think we should at least attempt to follow the auto-registering mechanism that most test frameworks out there have (and what I hacked very briefly on my test_metafunction branch), that would be Option 2. In fact, I would go as far as saying that maybe the compiler should have some kind of generic infrastructure to aid this "pattern"? Because this can also be extended not just to test frameworks, but to this metafunction registration/look-up problem, registry of bindings for other languages and probably more. Essentially, you write your code as normal, auto registration is generated on a per-TU basis, and then you need a way to signal the generation of a single and unambiguous function per shared object/program (as opposed to per TU), and we'd want the equivalent as well for tear-down.

EDIT: Note: I completely side-stepped this issue entirely on my test_metafunction branch by piggybacking on the fact that there must be only 1 main function in the program--I simply inject the "run test framework" in that function. Maybe we could have something similar for metafunctions?

hsutter commented 5 months ago

I think we should at least attempt to follow the auto-registering mechanism that most test frameworks out there have (and what I hacked very briefly on my test_metafunction branch), that would be Option 2. In fact, I would go as far as saying that maybe the compiler should have some kind of generic infrastructure to aid this "pattern"?

Is this similar to what I suggested in the Note in this 797 comment?

DyXel commented 5 months ago

Is this similar to what I suggested in the Note in this 797 comment?

If you mean

Future metafunction generation capabilities. We do (eventually) want metafunctions to be able to generate new declarations into existing scopes in the parse tree that are outside the type the metafunction is being applied to

Then yeah, it would be exactly that.

hsutter commented 5 months ago

Sorry, I meant this part (I didn't repaste it here because I didn't want to lose the link to Johel's followup comments about feasibility).

(for example, as a strawman: and have the first pass script invoke cppfront with a new flag that says to only compile functions that have a meta API declaration as a parameter; or require user-defined metafunctions to have names that start with @ and have the first pass script invoke cppfront with a new flag that says to only compile functions with a name that starts with @; or something else).

DyXel commented 5 months ago

Ah, I think I see what you mean. Let's grok this in 2 parts:

On the double-pass for a single file: I don't think this can be made a full solution currently; as soon as you consider multiple files, everything falls apart.

Marking metafunctions: I don't think that marking/annotating them is strictly necessary (after all, the currently checked-in code detects the signature just fine¹). Personally, I would argue in favor of actually marking them because:

¹ a bit limited but overall seems decent.

Side note: Maybe it's just me, but it's getting harder and harder to keep track of everything that has been discussed, especially since stuff is getting so meta 😅 I will try to make a post at a later date collecting all the problems and some potential solutions in one place, because this is getting unruly: jumping to several different places to get all the context is hard.

DyXel commented 4 months ago

In this post I compile all current "constraints" that I know of, that this specific solution has, as currently implemented in this PR.

Please let me know of any developments so I can directly update this post, and let us enumerate/name these constraints so we can more easily refer to them later.

Constraints

Constraint Nº1: Distinct compilation step required for metafunctions

Currently, we need to manually identify that we are building a metafunction library as opposed to a regular C++ program. I think this comes with the design of the solution, but having to remember extra steps is inconvenient from a User Experience (UX) standpoint. As I mentioned before in a different post: this extends the usage of cppfront from a simple transpiler (`Cpp2` -> `Cpp1` -> `compile and run!`) to something more involved *when authoring metafunctions or using user-defined ones* (`Cpp2` -> `Cpp1` -> `compile meta` -> `Cpp2` -> `Cpp1` -> `compile and run!`), which may cause potential adopters to lose interest.

Constraint Nº2: Inability to apply metafunctions defined in the same TU or "step"

A variation of the classic chicken-and-egg problem (compiler bootstrapping, etc.). Essentially, in order to have a metafunction available for use, you must first have compiled and loaded it as a DLL; therefore, you can't both define a metafunction and use it in the same "step" of your compilation process. In a similar vein, if users tangle themselves enough, they could be led to circular dependencies between regular code and metafunction code, breaking causality. Consider this trivial code for example:

```cpp
foo: @bar type = { } // Depends on @bar

bar: (inout t: cpp2::meta::type_declaration) = {
  a: foo = (); // Depends on foo
  _ = a;
}
// Which came first: foo or bar?
```

Currently, by having a strict separation of metafunction code and regular code, the circular dependency can be avoided somewhat, but it is something to keep in mind if we want to implement a workaround.

Constraint Nº3: Multiple symbols with the same name

This one, I would say, spans three distinct levels:

- a. Name collisions in different files which are `#include`'d together: Added for completeness. I don't think this one is a big deal, as it should be caught entirely by Cpp1 during compilation. Are there any potential edge cases to consider?
- b. Name collisions in distinct TUs, but in a single program/DLL: These violate the ODR, and no diagnostic is required, so they could be insidious to detect (though most linkers do diagnose them).
- c. Fully-qualified names in distinct DLLs, but loaded in a single invocation of cppfront: There must be a way to identify these in distinct DLLs that are loaded simultaneously and either reject them or disambiguate, as currently the first match is picked silently.

In my opinion, we should treat metafunction definitions just like C++ treats definitions: it is a violation to have multiple definitions with the same fully-qualified name (and, going further for our use case, even across different loaded DLLs). In cppfront we should actually catch these violations and report them to the user when possible.
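Level (c) could be diagnosed with a check along the following lines. This is a minimal C++ sketch under assumptions: `loaded_library`, `collision` and `find_collision` are hypothetical names, and it presumes cppfront can enumerate the fully-qualified metafunction names each loaded library provides.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical description of a loaded library (not a cppfront type).
struct loaded_library {
    std::string path;
    std::vector<std::string> metafunction_names;  // fully-qualified names
};

// Hypothetical diagnostic payload: which name clashed, and where.
struct collision {
    std::string name, first_library, second_library;
};

// Returns the first name that is defined by two different libraries, if any,
// instead of silently picking the first match.
inline std::optional<collision>
find_collision(std::vector<loaded_library> const& libraries) {
    std::map<std::string, std::string> owner;  // name -> defining library
    for (auto const& lib : libraries) {
        assert(!lib.path.empty());  // a loaded library always has a path
        for (auto const& name : lib.metafunction_names) {
            auto const [it, inserted] = owner.emplace(name, lib.path);
            if (!inserted && it->second != lib.path) {
                return collision{name, it->second, lib.path};
            }
        }
    }
    return std::nullopt;
}
```

A caller could turn the returned `collision` into an error message naming both libraries, or use it as the starting point for a disambiguation rule.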

Constraint Nº4: Need for an entry-point for the DLL ...

... and the need for CPPFRONT_METAFUNCTION_LIBRARY

This is the main reason (I believe) why `CPPFRONT_METAFUNCTION_LIBRARY` is needed right now. There must be a way to identify exported C names in distinct DLLs for "entry-point"s that are loaded simultaneously. As pointed out by @/MaxSagebaum, you can have multiple DLLs with the same name and different definitions; they shouldn't collide if you are manually loading a DLL via `dlopen` or `LoadLibrary`. Further, with the current approach, you'd need to define `CPPFRONT_METAFUNCTION_LIBRARY` for each file/TU in order to get a distinct name that won't collide when linking the entire DLL, which might not be viable at scale.

Constraint Nº5: Names are "C-namespaced" and name look-up is limited

There's a limitation in name look-up explained [here](https://github.com/hsutter/cppfront/pull/907/files#diff-09d210b8aa3d8090c8877ce7afc954f7e8affd2de2e0546bf0edbb7a9f7ba98eR116-R118). It is marked as a temporary alpha limitation so I expect this to be solvable with extra effort. TODO: Test how limited it is/put example code showcasing the limitation.

Constraint Nº6: There's no concept of "construction" or "tear-down" for generation

Currently, there's no specification of, or guarantee to the user or metafunction author about, how their libraries are loaded. It is therefore impossible to determine exactly when static constructors and destructors are going to be called within the library, and they cannot be relied upon for things like setting up resources or closing down out-of-file generation states. [Static local variables](https://en.cppreference.com/w/cpp/language/storage_duration#Static_local_variables) can be used as a reference starting point. `dlopen`/`LoadLibrary` and `dlclose`/`FreeLibrary` pairs seem to execute static constructors and destructors properly, so it could be possible to give users the guarantee that global static objects will be constructed/destructed on a per-file (TU) basis. Later, if cppfront is able to handle entire program compilations by itself, an additional guarantee could be given (no idea if it's actually a good idea):

* Global static objects will be constructed/destructed on a per-program basis; that is, constructors will run before processing any file, and destructors will run after processing all files but before generating the final program.
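To make the static-local starting point concrete, here is a minimal C++ sketch of lazily constructed generation state. `generation_state`, `state`, `on_declaration` and `event_log` are hypothetical names, and nothing here reflects guarantees that cppfront currently gives.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records lifecycle events so their order can be inspected.
inline std::vector<std::string>& event_log() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical per-library generation state (not a cppfront type).
struct generation_state {
    generation_state()  { event_log().push_back("set-up"); }
    // A real tear-down might flush out-of-file generation output here.
    ~generation_state() { event_log().push_back("tear-down"); }
    int declarations_seen = 0;
};

// Constructed on the first call; destroyed when static destructors run.
inline generation_state& state() {
    static generation_state instance;
    return instance;
}

// Stand-in for a metafunction body that uses the shared state.
inline int on_declaration() {
    int const count = ++state().declarations_seen;
    assert(event_log().size() == 1);  // set-up ran exactly once, no tear-down yet
    return count;
}
```

Set-up happens on first use rather than at load time, which makes construction independent of how the library is loaded; when tear-down runs still depends on when the library's static destructors are executed.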

Constraint Nº7: The mechanism for out-of-class generation is hard-coded to metafunctions

In order to generate a unique entry-point to load the library among other things, out-of-class/out-of-function generation is done, however, this process is not generic/reusable for metafunction authors. Ideally the whole mechanism used right now should be refactored such that it can both be used by this implementation and by users alike.

DyXel commented 4 months ago

I have drafted https://github.com/JohelEGP/cppfront/pull/2 in order to possibly tackle Constraint Nº3, Constraint Nº4 and Constraint Nº5.

JohelEGP commented 4 months ago

@DyXel To reply to your email, feel free to take over as you see fit. I haven't been programming lately, and I don't know when I'll be back.

DyXel commented 4 months ago

Hey, thanks for your response.

I haven't been programming lately, and I don't know when I'll be back.

I have been there as well, hope to see you active again later!

feel free to take over as you see fit.

Will ponder about it and see where that leads me. Cheers!

DyXel commented 4 months ago

I think I have thought about it for long enough, and I've decided to take over this feature. As I said in my mail to Johel, my agenda is to simplify this solution by re-implementing and/or refactoring the current work, and to solve several of the constraints mentioned in my comment above while doing so.

Here's my current plan, re-using some (most?) of the work in this PR:

If there are no objections to this roadmap, I would start working on it next weekend. @MaxSagebaum do I have your permission to reuse your work from https://github.com/hsutter/cppfront/pull/809 if needed?

MaxSagebaum commented 3 months ago

@DyXel yes sure. You have my permission.

Your plan sounds good to me.