numba / numba-rvsdg

Numba compatible RVSDG (Regionalized Value State Dependence Graph) utilities.
https://numba-rvsdg.readthedocs.io/
BSD 2-Clause "Simplified" License
18 stars 7 forks source link

Branch transformation creates invalid edge #80

Open sklam opened 1 year ago

sklam commented 1 year ago

See images below. Loop block has edge violating the region structures.

Reproducer 1

YAML:

"mock_block_0":
    jt: ["mock_block_2", "mock_block_10"]
"mock_block_2":
    jt: ["mock_block_5", "mock_block_11"]
"mock_block_5":
    jt: ["mock_block_10", "mock_block_5"]
"mock_block_10":
    jt: ["mock_block_11"]
"mock_block_11":
    jt: []

original graph: Screenshot 2023-06-23 at 10 50 17 AM section of transformed graph showing the bad edge: Screenshot 2023-06-23 at 10 49 59 AM

Reproducer 2

YAML:

"mock_block_0":
    jt: ["mock_block_11", "mock_block_2"]
"mock_block_1":
    jt: ["mock_block_7", "mock_block_6"]
"mock_block_2":
    jt: ["mock_block_3"]
"mock_block_3":
    jt: ["mock_block_4"]
"mock_block_4":
    jt: ["mock_block_5"]
"mock_block_5":
    jt: ["mock_block_4", "mock_block_1"]
"mock_block_6":
    jt: ["mock_block_18", "mock_block_5"]
"mock_block_7":
    jt: ["mock_block_10", "mock_block_8"]
"mock_block_8":
    jt: ["mock_block_5", "mock_block_11"]
"mock_block_9":
    jt: ["mock_block_3", "mock_block_12"]
"mock_block_10":
    jt: ["mock_block_10", "mock_block_6"]
"mock_block_11":
    jt: ["mock_block_12"]
"mock_block_12":
    jt: ["mock_block_15", "mock_block_17"]
"mock_block_15":
    jt: ["mock_block_9", "mock_block_17"]
"mock_block_17":
    jt: ["mock_block_18"]
"mock_block_18":
    jt: []

section of transformed graph showing the bad edge: Screenshot 2023-06-23 at 10 43 24 AM

sklam commented 1 year ago

I have a python bytecode based reproducer for this:


from numba_rvsdg.core.datastructures.byte_flow import ByteFlow
from numba_rvsdg.rendering.rendering import ByteFlowRenderer

def foo(x, y, z):
    if x:
        while z:
            print("A")
            z = 0
    print("B")
    return x, y, z

byteflow = ByteFlow.from_bytecode(foo.__code__)
bcmap = byteflow.scfg.bcmap_from_bytecode(byteflow.bc)
byteflow = byteflow.restructure()

bfr = ByteFlowRenderer()
bfr.bcmap_from_bytecode(byteflow.bc)
byteflow.scfg.view("scfg")

problematic edge: Screenshot 2023-06-27 at 1 49 04 PM

esc commented 1 year ago

Looking at the following screenshot:

Screen Shot 2023-06-29 at 10 27 08

I think that edge from basic_block_2 should go to synth_assign_block_2 and not basic_block_3. I think that this is the original target for basic_block_2 -- so I think this is a bug, either in the updating when inserting the synth_assign_block_2 or when updating the jump targets of the block.

esc commented 1 year ago

Debugging this some more with a breakpoint at:

diff --git i/numba_rvsdg/core/transformations.py w/numba_rvsdg/core/transformations.py
index 812024def4..0d3747cdbc 100644
--- i/numba_rvsdg/core/transformations.py
+++ w/numba_rvsdg/core/transformations.py
@@ -469,6 +469,7 @@ def restructure_branch(parent_region: RegionBlock):
     # Populate any empty branch regions by inserting a SyntheticBranch.
     for region in branch_regions:
         if region:
+            breakpoint()
             bra_start, inner_nodes = region
             if inner_nodes:
                 # Insert SyntheticTail

I see:

RegionBlock(name='loop_region_0',
            _jump_targets=('synth_asign_block_2',),

So the loop_region_0 region block does point to the right place.

But then inside the RegionBlock:

            subregion=SCFG(graph={'basic_block_2': BasicBlock(name='basic_block_2',
                                                              _jump_targets=('basic_block_3',
                                                                             'basic_block_2'),
                                                              backedges=('basic_block_2',))},
                           name_gen=NameGenerator(kinds={'basic': 5,
                                                         'control': 1,
                                                         'loop': 1,
                                                         'meta': 2,
                                                         'synth_asign': 3,
                                                         'synth_head': 1}),
                           region=RegionBlock(name='loop_region_0',
                                              _jump_targets=('basic_block_3',),

I am not sure what is going on here, but it seems like there are multiple jump_targets that are not being updated?

For reference, here is the complete printout of the datastructure:

(Pdb++) lr = scfg.graph['loop_region_0']
(Pdb++) pp(lr)
RegionBlock(name='loop_region_0',
            _jump_targets=('synth_asign_block_2',),
            backedges=(),
            kind='loop',
            parent_region=RegionBlock(name='meta_region_0',
                                      _jump_targets=(),
                                      backedges=(),
                                      kind='meta',
                                      parent_region=None,
                                      header=None,
                                      subregion=SCFG(graph={'basic_block_0': BasicBlock(name='basic_block_0',
                                                                                        _jump_targets=('basic_block_1',
                                                                                                       'synth_asign_block_0'),
                                                                                        backedges=()),
                                                            'basic_block_1': BasicBlock(name='basic_block_1',
                                                                                        _jump_targets=('loop_region_0',
                                                                                                       'synth_asign_block_1'),
                                                                                        backedges=()),
                                                            'basic_block_3': BasicBlock(name='basic_block_3',
                                                                                        _jump_targets=('basic_block_4',),
                                                                                        backedges=()),
                                                            'basic_block_4': BasicBlock(name='basic_block_4',
                                                                                        _jump_targets=(),
                                                                                        backedges=()),
                                                            'loop_region_0': <Recursion on RegionBlock with id=4313593296>,
                                                            'synth_asign_block_0': SyntheticAssignment(name='synth_asign_block_0',
                                                                                                       _jump_targets=('synth_head_block_0',),
                                                                                                       backedges=(),
                                                                                                       variable_assignment={'control_var_0': 0}),
                                                            'synth_asign_block_1': SyntheticAssignment(name='synth_asign_block_1',
                                                                                                       _jump_targets=('synth_head_block_0',),
                                                                                                       backedges=(),
                                                                                                       variable_assignment={'control_var_0': 1}),
                                                            'synth_asign_block_2': SyntheticAssignment(name='synth_asign_block_2',
                                                                                                       _jump_targets=('synth_head_block_0',),
                                                                                                       backedges=(),
                                                                                                       variable_assignment={'control_var_0': 2}),
                                                            'synth_head_block_0': SyntheticHead(name='synth_head_block_0',
                                                                                                _jump_targets=('basic_block_3',
                                                                                                               'basic_block_4'),
                                                                                                backedges=set(),
                                                                                                variable='control_var_0',
                                                                                                branch_value_table={0: 'basic_block_3',
                                                                                                                    1: 'basic_block_4',
                                                                                                                    2: 'basic_block_3'})},
                                                     name_gen=NameGenerator(kinds={'basic': 5,
                                                                                   'control': 1,
                                                                                   'loop': 1,
                                                                                   'meta': 2,
                                                                                   'synth_asign': 3,
                                                                                   'synth_head': 1}),
                                                     region=...),
                                      exiting=None),
            header='basic_block_2',
            subregion=SCFG(graph={'basic_block_2': BasicBlock(name='basic_block_2',
                                                              _jump_targets=('basic_block_3',
                                                                             'basic_block_2'),
                                                              backedges=('basic_block_2',))},
                           name_gen=NameGenerator(kinds={'basic': 5,
                                                         'control': 1,
                                                         'loop': 1,
                                                         'meta': 2,
                                                         'synth_asign': 3,
                                                         'synth_head': 1}),
                           region=RegionBlock(name='loop_region_0',
                                              _jump_targets=('basic_block_3',),
                                              backedges=(),
                                              kind='loop',
                                              parent_region=RegionBlock(name='meta_region_0',
                                                                        _jump_targets=(),
                                                                        backedges=(),
                                                                        kind='meta',
                                                                        parent_region=None,
                                                                        header=None,
                                                                        subregion=SCFG(graph={'basic_block_0': BasicBlock(name='basic_block_0',
                                                                                                                          _jump_targets=('basic_block_1',
                                                                                                                                         'synth_asign_block_0'),
                                                                                                                          backedges=()),
                                                                                              'basic_block_1': BasicBlock(name='basic_block_1',
                                                                                                                          _jump_targets=('loop_region_0',
                                                                                                                                         'synth_asign_block_1'),
                                                                                                                          backedges=()),
                                                                                              'basic_block_3': BasicBlock(name='basic_block_3',
                                                                                                                          _jump_targets=('basic_block_4',),
                                                                                                                          backedges=()),
                                                                                              'basic_block_4': BasicBlock(name='basic_block_4',
                                                                                                                          _jump_targets=(),
                                                                                                                          backedges=()),
                                                                                              'loop_region_0': <Recursion on RegionBlock with id=4313593296>,
                                                                                              'synth_asign_block_0': SyntheticAssignment(name='synth_asign_block_0',
                                                                                                                                         _jump_targets=('synth_head_block_0',),
                                                                                                                                         backedges=(),
                                                                                                                                         variable_assignment={'control_var_0': 0}),
                                                                                              'synth_asign_block_1': SyntheticAssignment(name='synth_asign_block_1',
                                                                                                                                         _jump_targets=('synth_head_block_0',),
                                                                                                                                         backedges=(),
                                                                                                                                         variable_assignment={'control_var_0': 1}),
                                                                                              'synth_asign_block_2': SyntheticAssignment(name='synth_asign_block_2',
                                                                                                                                         _jump_targets=('synth_head_block_0',),
                                                                                                                                         backedges=(),
                                                                                                                                         variable_assignment={'control_var_0': 2}),
                                                                                              'synth_head_block_0': SyntheticHead(name='synth_head_block_0',
                                                                                                                                  _jump_targets=('basic_block_3',
                                                                                                                                                 'basic_block_4'),
                                                                                                                                  backedges=set(),
                                                                                                                                  variable='control_var_0',
                                                                                                                                  branch_value_table={0: 'basic_block_3',
                                                                                                                                                      1: 'basic_block_4',
                                                                                                                                                      2: 'basic_block_3'})},
                                                                                       name_gen=NameGenerator(kinds={'basic': 5,
                                                                                                                     'control': 1,
                                                                                                                     'loop': 1,
                                                                                                                     'meta': 2,
                                                                                                                     'synth_asign': 3,
                                                                                                                     'synth_head': 1}),
                                                                                       region=...),
                                                                        exiting=None),
                                              header='basic_block_2',
                                              subregion=...,
                                              exiting='basic_block_2')),
            exiting='basic_block_2')
esc commented 1 year ago

Using the patch:

diff --git i/numba_rvsdg/core/transformations.py w/numba_rvsdg/core/transformations.py
index 812024def4..b73a7446ea 100644
--- i/numba_rvsdg/core/transformations.py
+++ w/numba_rvsdg/core/transformations.py
@@ -438,6 +438,7 @@ def restructure_branch(parent_region: RegionBlock):
     immdoms = _imm_doms(doms)
     regions = [r for r in _iter_branch_regions(scfg, immdoms, postimmdoms)]

+
     # Early exit when no branching regions are found.
     # TODO: the whole graph should become a linear mono head
     if not regions:
@@ -451,12 +452,15 @@ def restructure_branch(parent_region: RegionBlock):
         scfg, begin, head_region_blocks, branch_regions
     )

+    breakpoint()
     # Unify headers of tail subregion if need be.
     headers, entries = scfg.find_headers_and_entries(tail_region_blocks)
     if len(headers) > 1:
+        breakpoint()
         end = scfg.name_gen.new_block_name(block_names.SYNTH_HEAD)
         scfg.insert_block_and_control_blocks(end, entries, headers)

+    breakpoint()
     # Recompute regions.
     head_region_blocks = find_head_blocks(scfg, begin)
     branch_regions = find_branch_regions(scfg, begin, end)

I was able to identify that the error is caused by the call to scfg.insert_block_and_control_blocks(end, entries, headers) which only updates the top-level jump_targets of the loop_region_0 type RegionBlock. But does not update the jump_targets of the second loop_region_0 block nor the basic_block_2 that is contained within.