peter-winter / ctpg

Compile Time Parser Generator is a C++ single header library which takes a language description as a C++ code and turns it into a LR1 table parser with a deterministic finite automaton lexical analyzer, all in compile time.
MIT License
456 stars 24 forks source link

[question] Trouble Making Argument Lists #47

Closed cgbsu closed 2 years ago

cgbsu commented 2 years ago

Hello,

I don't know if this is an appropriate forum for this but I am kinda at the end of my rope here. Im at the end of the semester and need to get this in, I have been struggling for the last few days to implement function calls into my language.

I want them to be "factors" and usable in mathematical expressions (functions are just simple arithmetic expressions in my language with, argument lists (which have comparisons and logical operators).

The way I get factors right now is something like this:

////////////////////////////////  "Literals" ////////////////////////////////

non_terminal_term< Factor >( term< FunctionResult > )  //`@` is a special variable used for a contracts, nothing really to do with call syntax. //
        >= []( auto token ) {
                return std::move( Warp::Utilities::allocate_node< FunctionResult >() );
        }, 
non_terminal_term< Factor >( term< Identifier > ) 
        >= []( auto token ) {
                return std::move( Warp::Utilities::allocate_node< 
                                Warp::AbstractSyntaxTree::NodeType::Identifier 
                        >( Warp::Utilities::hash_string( token ) ) );
        }, 
non_terminal_term< Factor >( term< NaturalNumber > ) 
        >= []( auto token )
        {
                return std::move( Warp::Utilities::allocate_integral_literal_node< 
                        ResolvedType< NaturalNumber > 
                    >( token ) );
        }, 
////////////////////////////////  Multiplication/Division ////////////////////////////////

non_terminal_term< Factor >( non_terminal_term< Factor >, term< FactorMultiply >, term< NaturalNumber > ) 
        >= []( auto current_factor, auto, const auto& next_token )
            {
                return Warp::Utilities::allocate_node< FactorMultiply >( 
                        current_factor, 
                        Warp::Utilities::allocate_integral_literal_node< 
                                ResolvedType< NaturalNumber > 
                            >( next_token )
                    );
            }, 
non_terminal_term< Factor >( non_terminal_term< Factor >, term< FactorDivide >, term< NaturalNumber > ) 
        >= []( auto current_factor, auto, const auto& next_token )
            {
                return Warp::Utilities::allocate_node< FactorDivide >( 
                        current_factor, 
                        Warp::Utilities::allocate_integral_literal_node< 
                                ResolvedType< NaturalNumber > 
                            >( next_token )
                    );
            }, 
non_terminal_term< Factor >( non_terminal_term< Factor >, term< FactorMultiply >, term< Identifier > ) 
        >= []( auto current_factor, auto, const auto& next_token )
            {
                return Warp::Utilities::allocate_node< FactorMultiply >( 
                        current_factor, 
                        Warp::Utilities::allocate_node< 
                                Warp::AbstractSyntaxTree::NodeType::Identifier 
                            >( Warp::Utilities::hash_string( next_token ) )
                    );
            }, 
non_terminal_term< Factor >( non_terminal_term< Factor >, term< FactorDivide >, term< Identifier > ) 
        >= []( auto current_factor, auto, const auto& next_token )
            {
                return Warp::Utilities::allocate_node< FactorDivide >( 
                        current_factor, 
                        Warp::Utilities::allocate_node< 
                                Warp::AbstractSyntaxTree::NodeType::Identifier 
                            >( Warp::Utilities::hash_string( next_token ) )
                    );
            }, 

////////////////////////////////  Parenthesis ////////////////////////////////

non_terminal_term< ParenthesisScope >( term< OpenParenthesis >, non_terminal_term< Factor >, term< CloseParenthesis > )
        >= [] ( auto, auto factor, auto ) { return factor; }, 
non_terminal_term< Factor >( non_terminal_term< ParenthesisScope > ) 
        >= []( auto parenthesis_scope ) { return parenthesis_scope; }, 
non_terminal_term< Factor >( non_terminal_term< Factor >, term< FactorMultiply >, non_terminal_term< ParenthesisScope > ) 
        >= []( auto factor, auto, auto parenthesis_scope ) 
            {
                return Warp::Utilities::allocate_node< FactorMultiply >( 
                        factor, 
                        parenthesis_scope 
                    );
            }, 
non_terminal_term< Factor >( non_terminal_term< Factor >, term< FactorDivide >, non_terminal_term< ParenthesisScope > ) 
        >= []( auto factor, auto, auto parenthesis_scope ) 
            {
                return Warp::Utilities::allocate_node< FactorDivide >( 
                        factor, 
                        parenthesis_scope 
                    );
            }, 
////////////////////////////////  Addition/Subtraction ////////////////////////////////
non_terminal_term< Factor >( non_terminal_term< Sum > ) 
        >= []( auto sum ) {
            return sum;
        }, 
non_terminal_term< Sum >( non_terminal_term< Factor >, term< SumAdd >, non_terminal_term< Factor > ) 
        >= []( auto current_sum, auto, const auto& next_token ) 
            {
                return Warp::Utilities::allocate_node< SumAdd >( 
                        current_sum, 
                        next_token 
                    );
            }, 
non_terminal_term< Sum >( non_terminal_term< Factor >, term< SumSubtract >, non_terminal_term< Factor > ) 
        >= []( auto current_sum, auto, const auto& next_token )
            {
                return Warp::Utilities::allocate_node< SumSubtract >( 
                        current_sum, 
                        next_token 
                    );
            }, 
non_terminal_term< Expression >( non_terminal_term< Factor > ) 
        >= []( auto factor ) {
            return factor;
        }, 
//...

If I am just trying to reduce factors the following works perfectly


non_terminal_term< Call >( term< Identifier >, term< OpenParenthesis > ) 
        >= []( auto identifier, auto ) // function, auto identifier, auto )
            {
                std::cout << "Call found!\n";
                return Warp::CompilerRuntime::CallType{ 
                        Warp::Utilities::hash_string( identifier ), 
                        Warp::Utilities::VectorType< Warp::AbstractSyntaxTree::NodeVariantType >{} 
                    };
            }, 
non_terminal_term< Call >( non_terminal_term< Call >, non_terminal_term< Factor >, term< FunctionParameterNextParameter > ) 
        >= [](auto call, auto argument, auto )
            {
                std::cout << "Argument buffered!\n";
                return Warp::CompilerRuntime::CallType{ 
                        call.identifier, 
                        Warp::Utilities::VectorType< Warp::AbstractSyntaxTree::NodeVariantType >{ 
                                call.arguments, 
                                argument 
                            } 
                    };
            }, 
non_terminal_term< Factor >( non_terminal_term< Call >, non_terminal_term< Factor >, term< CloseParenthesis > ) 
        >= []( auto call, auto argument, auto )
            {
                std::cout << "Call end 0!\n";
                return Warp::Utilities::allocate_node< Warp::AbstractSyntaxTree::NodeType::FunctionCall >( 
                        call.identifier, 
                        Warp::Utilities::VectorType< Warp::AbstractSyntaxTree::NodeVariantType >{ 
                                call.arguments, 
                                argument 
                            } 
                    );
            }, 
non_terminal_term< Factor >( non_terminal_term< Call >, term< CloseParenthesis > ) 
        >= []( auto call, auto )
            {
                std::cout << "Call end 1!\n";
                return Warp::Utilities::allocate_node< Warp::AbstractSyntaxTree::NodeType::FunctionCall >( 
                        call.identifier, 
                        call.arguments 
                    );
            }

Here are the test cases

static TestSuiteType factor_calls{
        WarpTest{ true, "test( 1 )" }, 
        WarpTest{ true, "test( a )" }, 
        WarpTest{ true, "20 * test( a )" }, 
        WarpTest{ true, "20 + test( a )" }, 
        WarpTest{ true, "test( a ) * 20" }, 
        WarpTest{ true, "test( a ) + 10" }, 
        WarpTest{ true, "test( a * a )" }, 
        WarpTest{ true, "test( a(), b( q, 4, 5646, 345345 * 445656 + 34 ), rdfg * 34534 )" }, 
        WarpTest{ true, "test( a, q, 4, 5646, 345345 * 445656 + 34, rdfg * 34534 )" }, 
        WarpTest{ true, "test( ttt(), q, 4, 5646, 345345 * 445656 + 34, rdfg * 34534 )" } 
    };

But when I try to put it in the context of reducing a function the compiler seems to get into an infinite loop and wont compile the parser. To get an idea of the test cases I am working with for a whole function:

static TestSuiteType function_alternative_calls{
        WarpTest{ true, "let test( a : a < 64 ) :: test( 1 );" }, 
        WarpTest{ true, "let test( a : a < 64 ) :: test( a );" }, 
        WarpTest{ true, "let test( a : a < 64 ) :: 20 * test( a );" }, 
        WarpTest{ true, "let test( a : a < 64 ) :: 20 + test( a );" }, 
        WarpTest{ true, "let test( a : a < 64 ) :: test( a ) * 20;" }, 
        WarpTest{ true, "let test( a : a < 64 ) :: test( a ) + 10;" }, 
        WarpTest{ true, "let test( a : a < 64 ) :: test( a * a );" }, 
        WarpTest{ true, "let test( a : a < 64 ) :: test( a(), b( q, 4, 5646, 345345 * 445656 + 34 ), rdfg * 34534 );" }, 
        WarpTest{ true, "let test( a : a < 64 ) :: test( a, q, 4, 5646, 345345 * 445656 + 34, rdfg * 34534 );" }, 
        WarpTest{ true, "let test( a : a < 64 ) :: test( ttt(), q, 4, 5646, 345345 * 445656 + 34, rdfg * 34534 );" } 
    };

Its crucial I make this work so my language can be Turing complete and I can use recursion.

I am able to get all these tests with the exception of 3, here are those results:

Pass: let test( a : a < 64 ) :: test( 1 );
Pass: let test( a : a < 64 ) :: test( a );
[1:36] PARSE: Syntax error: Unexpected '('
Fail: let test( a : a < 64 ) :: 20 * test( a );
[1:38] PARSE: Syntax error: Unexpected 'Identifier'
Fail: let test( a : a < 64 ) :: 20 + test( a );
Pass: let test( a : a < 64 ) :: test( a ) * 20;
[1:37] PARSE: Syntax error: Unexpected '+'
Fail: let test( a : a < 64 ) :: test( a ) + 10;
Pass: let test( a : a < 64 ) :: test( a * a );
Pass: let test( a : a < 64 ) :: test( a(), b( q, 4, 5646, 345345 * 445656 + 34 ), rdfg * 34534 );
Pass: let test( a : a < 64 ) :: test( a, q, 4, 5646, 345345 * 445656 + 34, rdfg * 34534 );
Pass: let test( a : a < 64 ) :: test( ttt(), q, 4, 5646, 345345 * 445656 + 34, rdfg * 34534 );

Here is the code that makes that work:

//... code that analyzes function parameters tries to buffer the expression after the `::` operator
non_terminal_term< ExpressionEater >( non_terminal_term< ExpressionEater >, term< FactorMultiply >, non_terminal_term< Expression > ) 
        >= []( auto function, auto, auto factor ) {
                return subsume_function_alternative_expression< FactorMultiply >( function, factor );
            }, 
non_terminal_term< ExpressionEater >( non_terminal_term< ExpressionEater >, term< FactorDivide >, non_terminal_term< Expression > ) 
        >= []( auto function, auto, auto factor ) {
                return subsume_function_alternative_expression< FactorDivide >( function, factor );
            }, 
non_terminal_term< Call >( term< Identifier >, term< OpenParenthesis > ) 
        >= []( auto identifier, auto ) 
            {
                return Warp::CompilerRuntime::CallType{ 
                        Warp::Utilities::hash_string( identifier ), 
                        std::vector< Warp::AbstractSyntaxTree::NodeVariantType >{}
                        //Warp::Utilities::VectorType< Warp::AbstractSyntaxTree::NodeVariantType >{} 
                    };
            }, 
non_terminal_term< Call >( non_terminal_term< Call >, non_terminal_term< Factor >, term< FunctionParameterNextParameter > ) 
        >= [](auto call, auto argument, auto ) {
                call.arguments.push_back( argument );
                return call;
            }, 
non_terminal_term< CallNode >( non_terminal_term< Call >, non_terminal_term< Factor >, term< CloseParenthesis > ) 
        >= []( auto call, auto argument, auto )
            {
                call.arguments.push_back( argument );
                return Warp::Utilities::allocate_node< Warp::AbstractSyntaxTree::NodeType::FunctionCall >( 
                        call.identifier, 
                        call.arguments 
                    );
            }, 
non_terminal_term< Call >( non_terminal_term< Call >, non_terminal_term< CallNode >, term< FunctionParameterNextParameter > ) 
        >= [](auto call, auto argument, auto ) {
                call.arguments.push_back( argument );
                return call;
            }, 
non_terminal_term< CallNode >( non_terminal_term< Call >, non_terminal_term< CallNode >, term< CloseParenthesis > ) 
        >= []( auto call, auto argument, auto )
            {
                return Warp::Utilities::allocate_node< Warp::AbstractSyntaxTree::NodeType::FunctionCall >( 
                        call.identifier, 
                        call.arguments 
                    );
            }, 
non_terminal_term< CallNode >( non_terminal_term< Call >, term< CloseParenthesis > ) 
        >= []( auto call, auto )
            {
                return Warp::Utilities::allocate_node< Warp::AbstractSyntaxTree::NodeType::FunctionCall >( 
                        call.identifier, 
                        call.arguments 
                    );
            }, 
non_terminal_term< Expression >( non_terminal_term< CallNode > )
        >= []( auto call ) {
                return call;
            }

As you may be able to see this is a bit of a "hack" partially re-implementing factors towards the top there. It cant do sums and it cant take into account any products before the function call.

I have tried everything you can see some of the changes I have made in detail on this branch but to summarize, the main problem is though if I try to make this "complete" in any sense where it loops back to a factor or its equivalent and treats the call like a factor (as I want it too) the compiler infinitely loops, even though it works just fine when reducing just a factor. It doesent seem to matter where I put the terminal terms, non-terminal-terms, etc.

One thing I have also tried is never "leaving" factor evaluation.

non_terminal_term< Factor >( non_terminal_term< Factor >, term< FactorMultiply >, term< Identifier >, term< OpenParenthesis > ) 
        >= []( auto factor, auto, auto identifier, auto ) 
            {
                return Warp::Utilities::allocate_node< FactorMultiply >( 
                        factor, 
                        Warp::Utilities::allocate_node< Warp::AbstractSyntaxTree::NodeType::FunctionCall >( 
                                Warp::Utilities::hash_string( identifier ), 
                                std::vector< Warp::AbstractSyntaxTree::NodeVariantType >{} 
                                // Warp::Utilities::VectorType< Warp::AbstractSyntaxTree::NodeVariantType >{} 
                            ) 
                    );
            }, 
non_terminal_term< Factor >( non_terminal_term< Factor >, non_terminal_term< Factor >, term< FunctionParameterNextParameter > ) 
        >= []( auto operation_call, auto argument, auto )
            {
                if( Warp::Utilities::is_bi_node( operation_call ) == false ) 
                {
                    std::cerr << "ERROR!!!: Attempt to parse function call failed! "
                            << "Bi-Node not found where expected (Factor, Factor, NextParmaeterToken(Comma))! "
                            << "Returning first factor.\n";
                }
                auto proxy = Warp::Utilities::bi_node_proxy( operation_call ).value();
                if( Warp::Utilities::tag_is< Warp::AbstractSyntaxTree::NodeType::FunctionCall >( proxy.right ) == true )
                {
                    std::cerr << "ERROR!!!: Attempt to parse function call failed! "
                            << "FunctionCall not found where expected (Factor (left: Factor, right: FunctionCall), "
                            << "Factor, NextParmaeterToken(Comma))! Returning first factor.\n";
                }
                auto call = static_cast< Warp::CompilerRuntime::CallType* >( proxy.right->get_data() );
                call->arguments.push_back( argument );
                return operation_call;
            }, 
non_terminal_term< Factor >( non_terminal_term< Factor >, non_terminal_term< Factor >, term< CloseParenthesis > ) 
        >= []( auto operation_call, auto argument, auto )
            {
                if( Warp::Utilities::is_bi_node( operation_call ) == false ) 
                {
                    std::cerr << "ERROR!!!: Attempt to parse function call failed! "
                            << "Bi-Node not found where expected (Factor, Factor, CloseParenthesis)! "
                            << "Returning first factor.\n";
                }
                auto proxy = Warp::Utilities::bi_node_proxy( operation_call ).value();
                if( Warp::Utilities::tag_is< Warp::AbstractSyntaxTree::NodeType::FunctionCall >( proxy.right ) == true )
                {
                    std::cerr << "ERROR!!!: Attempt to parse function call failed! "
                            << "FunctionCall not found where expected (Factor (left: Factor, right: FunctionCall), "
                            << "Factor, CloseParenthesis)! Returning first factor.\n";
                }
                auto call = static_cast< Warp::CompilerRuntime::CallType* >( proxy.right->get_data() );
                call->arguments.push_back( argument );
                return operation_call;
            } 

Unfortunately its the same result :(

An attempt at some pseudo - BNF's I have tried

call := 
            | <identifier > (
            | <call> <factor> , 
            | <complete-call> , 
complete-call = 
            | <call> < factor > )
            | <call> )
factor/expression = complete-call
call := 
            | <identifier > (
            | <call> <argument-list> )
            | <call> )
argument-list = 
            | <factor> ,
            | <argument-list> <factor>, 
factor := 
            | <factor> * <literal>
            | <factor> / <literal>
            | <factor> * <identifier> (
            | <factor>, 
            | <factor> )

All these in various permutations and versions (again you can look at my changes on this branch

Again I don't know if this is an appropriate forum I apologize if its not, any help would really be appreciated, I'm really under the wire.

cgbsu commented 2 years ago

Turns out I needed to increase the constexpr ops depth -fconstexpr-ops-limit=3355443299 in gcc 😓