hsutter / cppfront

A personal experimental C++ Syntax 2 -> Syntax 1 compiler
Other
5.39k stars 232 forks source link

fix: recognize last uses (including of `this`) #887

Closed JohelEGP closed 6 months ago

JohelEGP commented 8 months ago

Resolves #890. Resolves #888. Resolves #884. Resolves #869. Resolves #857. Resolves #850. Resolves #847. Resolves #832. Resolves #825. Resolves #313. Resolves https://github.com/hsutter/cppfront/commit/13f765ba7927478162318325eff55b2a01c3da81#commitcomment-133825753. Resolves https://github.com/hsutter/cppfront/commit/99d66f0f3937d4c4e4d94756a773d398071f93b6#diff-b45bd40d0ae8fbf75e5dcf83e067c0f9a715d4f46e05ffe839d487878f0dcc44R31-R35.

Testing summary:

100% tests passed, 0 tests failed out of 898

Total Test time (real) =  69.23 sec

Acknowledgements:

JohelEGP commented 8 months ago

Feel free to review. All that's left is #869. (Not anymore since commit 5b6e1fd0f50aec8f4b81a707210e48cf7285a5b1).

JohelEGP commented 8 months ago

Resolved with commit 7f4a60f03315dfe2968ce131249bba930dc762cd.

Maybe my mental model was wrong and the implicit else branch isn't "the rest of the function". Compiling reflect.h2 results in

reflect.h2...
reflect.h2(862,5): error: local variable underlying_type must be initialized on both branches or neither branch
reflect.h2(865,5): error: "if" initializes underlying_type on:
  implicit else branch
but not on:
  branch starting at line 865
  ==> program violates initialization safety guarantee - see previous errors

for

/*862*/    underlying_type : std::string;
/*863*/
/*864*/    t.reserve_names( "operator=", "operator<=>" );
/*865*/    if bitwise {
/*866*/        t.reserve_names( "has", "set", "clear", "to_string", "get_raw_value", "none" );
/*867*/    }
/*868*/
/*869*/    //  1. Gather: The names of all the user-written members, and find/compute the type
/*870*/
/*871*/    underlying_type = t.get_argument(0);    // use the first template argument, if there was one

Is this error message legit or a bug?

JohelEGP commented 8 months ago

Resolved with commit 7f4a60f03315dfe2968ce131249bba930dc762cd.

Maybe my mental model was wrong and the implicit else branch isn't "the rest of the function".

The error message is a bug on my implementation.

440's expectations are still correct.

It seems like the problem isn't with my mental model on the else branch. Instead, it seems like an implicit else branch has an implicit main selection statement. That statement isn't the if that introduces the implicit else branch, but all statements at the level of the if up to an opening {.

Both reflect.h2's code and #440's example would continue working with such a model.

JohelEGP commented 8 months ago

This really hits on generation times. Is it the use of std::map? A debug configuration of cppfront from branch main compiles reflect.h2 in 0.488s, whereas it takes 2.762s with this PR. I have also noticed the most heavy tests taking from 0.13 s to 0.48 s to generate.

JohelEGP commented 8 months ago

Resolved with commit 7f4a60f03315dfe2968ce131249bba930dc762cd.

Maybe my mental model was wrong and the implicit else branch isn't "the rest of the function".

Yeah. The rest of the function is executed, unlike an else branch.

gregmarr commented 8 months ago

Is it the use of std::map? A debug configuration of cppfront from branch main compiles reflect.h2 in 0.488s, whereas it takes 2.762s with this PR.

Brainstorming: Containers do tend to be slower in debug, have you compared release? Maybe try unordered_map? How big does this map get? Could you use a vector? Does all the population get done before you start lookup? If so, you could populate, sort, and do binary search for lookup. Could you put a final_position (looks like this means "sema visit order") in token rather than having to store it externally? It would have to be mutable since it's taken by const &.

hsutter commented 8 months ago

Thanks for this. That's a lot of issues being tackled!

I've taken a pass through the PR code, and my main question so far is: Can you elaborate why the statements following an if statement with an implicit else branch are being added to that previously-empty implicit else branch node? Maybe an example of code that this representation makes easier to handle would help me understand.

JohelEGP commented 8 months ago

The approach, started with commit e9cc033b36e47de783a3c4ae12174d66fc592fa6, fixes #440 without introducing apparent regressions.

I'm not sure if it makes anything easier to handle. In fact, I now have to treat this implicit else branch as nothing at all in the various places an else in source code is handled.

I suspect that the rightful thing to do is to fix the two algorithms in sema (last use and definite initialization).

JohelEGP commented 8 months ago

Is it the use of std::map? A debug configuration of cppfront from branch main compiles reflect.h2 in 0.488s, whereas it takes 2.762s with this PR.

Brainstorming: Containers do tend to be slower in debug, have you compared release? Maybe try unordered_map? How big does this map get? Could you use a vector? Does all the population get done before you start lookup? If so, you could populate, sort, and do binary search for lookup. Could you put a final_position (looks like this means "sema visit order") in token rather than having to store it externally? It would have to be mutable since it's taken by const &.

Thank you. Switching to boost::container::flat_map lowers it to 1.6 s (-1 s from std::map). Adding it as a mutable member to token makes it take as long as the branch main.

gregmarr commented 8 months ago

You're welcome.

JohelEGP commented 8 months ago

Commit d63a0e28c278d72726c6d955d494080676cb4890 reverts the wrong approach to fixing #440 and #884. I set out on this PR because I wanted #884 fixed. I'll try fixing the symbol navigation algorithm that gives the wrong result.

JohelEGP commented 8 months ago

I ended up writing a new algorithm.

The previous one pre-recorded selection depths. It then attempted to make sense out of them while popping out of branches.

This new ones doesn't pre-record anything. Instead, once it finds a last use, it pops to the nearest containing scope. If that's a branch, it continues finding last uses in sibling branches. When it isn't a branch, it pops to the next containing scope, hopefully recursing the logic for success.

iprtel commented 8 months ago

Should this fix handle the following?

test_if : () = {
    a:_ = new<ResultSet>();
    if (a*.next()) {
    }

    b:ResultSet = ();
    if (b.next()) {
        _ = b.value(); // [1]
    }
    //_ = b; [2]
}

test_while: () = {
    a:_ = new<ResultSet>();
    while (a*.next()) {
    }

    b:ResultSet = ();
    while (b.next()) {
    }
}

in test_if I need either [1] or [2] for the local variable b but I don't need it in test_while but I would expect it to be required.

next : (inout this) -> bool = { return false; }
JohelEGP commented 8 months ago

No. Last uses that are part of the iteration of a loop are not implicitly moved. The if is executed only once, so the move on last use means you can't call the inout member in the condition without later discarding the last use.

hsutter commented 6 months ago

Whew! This is ~1100 lines, which I think makes it the biggest PR yet. Thanks for waiting for me to review it.

hsutter commented 6 months ago

@JohelEGP I think this is ready to merge. The only remaining thing is the three failing regression test runs, but those appear to be only because of diffs where the regression test baselines don't reflect the cpp2util.h line number changes, and so should be innocuous... does that sound right?

JohelEGP commented 6 months ago

That's right, according to the CI patches. Thank you for reviewing.

hsutter commented 6 months ago

Thanks!