Ensure the return path analysis traverses the entire CFG

jjcnn commented 2 weeks ago

Description

The return path analysis ensures that if a method must return a value then every path through a method returns a value. The analysis traverses the CFG of each method in a breadth-first manner from the (unique) entry point to the (unique) exit point. If there is ever a node in the graph that does not have outgoing edges, then an error is thrown.

So far the condition in the outer loop has been !rovers.is_empty() && rovers[0] != exit_point, where rovers is the current set of nodes to be analyzed. The requirement rovers[0] != exit_point potentially causes part of the CFG to not be traversed, but appears to have been added because of cases like the following:

fn main() -> u64 {
    impl core::ops::Eq for X {
        fn eq(self, other: Self) -> bool {
            asm(r1: self, r2: other, r3) {
                eq r3 r2 r1;
                r3: bool
            }
        }
    }
    if X::Y(true) == X::Y(true) {
        a
    } else {
        a
    }
}

The CFG for main contains the CFG for eq as a subgraph, but since the exit point for eq does not have a path leading to the exit point of main the analysis throws an error when the condition is removed. (It is not clear to me why this is, but based on my testing it happens consistently).

This is clearly incorrect behavior since the edge leading into the eq CFG does not represent control flow.

To solve this problem I have introduced a condition in the inner loop in which a node is ignored if it has been visited before, or if it represents a method declaration that is different from the entry point of the method currently under analysis.

Additionally I have changed the representation of visited to be a HashSet instead of a vector. This should improve performance slightly.

I have not been able to find a test program that causes the original algorithm to allow illegal programs to pass, but the new algorithm seems more obviously correct.

Checklist

[x] I have linked to any relevant issues.
[x] I have commented my code, particularly in hard-to-understand areas.
[x] I have updated the documentation where relevant (API docs, the reference, and the Sway book).
- [ ] If my change requires substantial documentation changes, I have requested support from the DevRel team
[x] I have added tests that prove my fix is effective or that my feature works.
[x] I have added (or requested a maintainer to add) the necessary Breaking* or New Feature labels where relevant.
[x] I have done my best to ensure that my PR adheres to the Fuel Labs Code Review Standards.
[x] I have requested a review from the relevant team or maintainers.

codspeed-hq[bot] commented 2 weeks ago

CodSpeed Performance Report

Merging #6541 will improve performances by 20.82%

_{Comparing jjcnn/return_path_analysis_exit_fix (8a777e9) with master (7b07216)}

Summary

⚡ 1 improvements ✅ 21 untouched benchmarks

Benchmarks breakdown

	Benchmark	`master`	`jjcnn/return_path_analysis_exit_fix`	Change
⚡	`document_symbol`	5.2 ms	4.3 ms	+20.82%

jjcnn commented 2 weeks ago

Can we test this to validate some previously unhandled state at least?

I haven't been able to find an example where the old algorithm gave an incorrect result. I can keep looking if you want?

When I removed the rovers[0] != exit_point check I got an error in this test case (as well as in a number of others):

https://github.com/FuelLabs/sway/tree/master/test/src/e2e_vm_tests/test_programs/should_pass/language/enum_in_fn_decl

That's the one I used to debug my algorithm.

FuelLabs / sway