inducer / loopy

A code generator for array-based code on CPUs and GPUs
http://mathema.tician.de/software/loopy
MIT License

Private variables not supported with ISPC / Exception needs explanations #763

Closed: dm-maxar closed this issue 1 year ago

dm-maxar commented 1 year ago

With 609e143, I observe that making a private variable with

import loopy as lp
import numpy as np
from loopy.version import LOOPY_USE_LANGUAGE_VERSION_2018_2  # noqa: F401 (selects the kernel language version)

knl = lp.make_kernel(
    "{ [k]: 0<=k<K }",
    """
    <float32> b = 6.0 * float_pos[k]
    output[k] = 2.0 * b
    """,
    [
        lp.ValueArg("K", is_input=True),
        lp.GlobalArg("float_pos", np.float32, shape=lp.auto,
                     is_input=True, is_output=False),
        lp.GlobalArg("output", np.uint8, shape=lp.auto,
                     is_input=False, is_output=True),
    ],
    target=lp.ISPCTarget(),
    assumptions="1<K")

knl = lp.set_temporary_address_space(knl, "b", "private")

cg_result = lp.generate_code_v2(knl)

print(cg_result.device_code())

leads to the following error:

Traceback (most recent call last):
  File "<>\LoopyLearner\run2_ispc.py", line 17, in <module>
    cg_result = lp.generate_code_v2(knl)
  File "<>\loopy\loopy\codegen\__init__.py", line 626, in generate_code_v2
    cgr = generate_code_for_a_single_kernel(program[func_id],
  File "<>\loopy\loopy\codegen\__init__.py", line 413, in generate_code_for_a_single_kernel
    codegen_result = generate_host_or_device_program(
  File "<>\loopy\loopy\codegen\result.py", line 339, in generate_host_or_device_program
    codegen_result = build_loop_nest(codegen_state, schedule_index)
  File "<>\loopy\loopy\codegen\control.py", line 493, in build_loop_nest
    insn_group = build_insn_group(sched_index_info_entries, codegen_state)
  File "<>\loopy\loopy\codegen\control.py", line 486, in build_insn_group
    result = gen_code(new_codegen_state)
  File "<>\loopy\loopy\codegen\control.py", line 428, in gen_code
    inner = generate_code_for_sched_index(
  File "<>\loopy\loopy\codegen\control.py", line 53, in generate_code_for_sched_index
    codegen_result = generate_host_or_device_program(
  File "<>\loopy\loopy\codegen\result.py", line 334, in generate_host_or_device_program
    codegen_result = set_up_hw_parallel_loops(
  File "<>\loopy\loopy\codegen\loop.py", line 252, in set_up_hw_parallel_loops
    return next_func(codegen_state)
  File "<>\loopy\loopy\codegen\control.py", line 493, in build_loop_nest
    insn_group = build_insn_group(sched_index_info_entries, codegen_state)
  File "<>\loopy\loopy\codegen\control.py", line 486, in build_insn_group
    result = gen_code(new_codegen_state)
  File "<>\loopy\loopy\codegen\control.py", line 428, in gen_code
    inner = generate_code_for_sched_index(
  File "<>\loopy\loopy\codegen\control.py", line 98, in generate_code_for_sched_index
    return func(codegen_state, sched_index)
  File "<>\loopy\loopy\codegen\loop.py", line 441, in generate_sequential_loop_dim_code
    inner = build_loop_nest(new_codegen_state, sched_index+1)
  File "<>\loopy\loopy\codegen\control.py", line 493, in build_loop_nest
    insn_group = build_insn_group(sched_index_info_entries, codegen_state)
  File "<>\loopy\loopy\codegen\control.py", line 486, in build_insn_group
    result = gen_code(new_codegen_state)
  File "<>\loopy\loopy\codegen\control.py", line 428, in gen_code
    inner = generate_code_for_sched_index(
  File "<>\loopy\loopy\codegen\control.py", line 137, in generate_code_for_sched_index
    return codegen_state.try_vectorized(
  File "<>\loopy\loopy\codegen\__init__.py", line 272, in try_vectorized
    return func(self)
  File "<>\loopy\loopy\codegen\control.py", line 139, in <lambda>
    lambda inner_cgs: generate_instruction_code(inner_cgs, insn))
  File "<>\loopy\loopy\codegen\instruction.py", line 87, in generate_instruction_code
    ast = generate_assignment_instruction_code(codegen_state, insn)
  File "<>\loopy\loopy\codegen\instruction.py", line 161, in generate_assignment_instruction_code
    result = codegen_state.ast_builder.emit_assignment(codegen_state, insn)
  File "<>\loopy\loopy\target\ispc.py", line 475, in emit_assignment
    return Assign(ecm(lhs, prec=PREC_NONE, type_context=None), rhs_code)
  File "<>\loopy\loopy\target\c\codegen\expression.py", line 136, in __call__
    self.rec(expr, type_context, needed_dtype))
  File "<>\loopy\loopy\target\c\codegen\expression.py", line 123, in rec
    return RecursiveMapper.rec(self, expr, type_context)
  File "<>\pymbolic\mapper\__init__.py", line 141, in __call__
    result = method(expr, *args, **kwargs)
  File "<>\loopy\loopy\target\ispc.py", line 96, in map_variable
    gsize, lsize = self.kernel.get_grid_size_upper_bounds_as_exprs()
TypeError: LoopKernel.get_grid_size_upper_bounds_as_exprs() missing 1 required positional argument: 'callables_table'

Looking at line 96 of ispc.py, the call to get_grid_size_upper_bounds_as_exprs passes no argument, but the method requires one (callables_table). Perhaps private variable support has not been implemented for the ISPC target yet. If it has, this code path should raise an exception that helps the user understand how to avoid it.
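
To illustrate the two outcomes described above, here is a minimal sketch that does not use loopy at all. ToyKernel, UnsupportedFeatureError, grid_sizes_old_call and grid_sizes_with_context are hypothetical stand-ins invented for this example, and the wording of the explanatory message is only an assumption about what might help a user. It is not the actual fix. It only contrasts the bare TypeError from a missing argument with an exception that states the precondition.

```python
# Hypothetical stand-ins; none of these names are loopy's real implementation.

class ToyKernel:
    """Stand-in for a kernel whose grid-size query needs a callables table."""

    def get_grid_size_upper_bounds_as_exprs(self, callables_table):
        # Return fixed placeholder sizes; a real kernel would derive them.
        return (("K",), (1,))


class UnsupportedFeatureError(RuntimeError):
    """Hypothetical error type that carries a user-facing explanation."""


def grid_sizes_old_call(kernel):
    # Mirrors the failing call pattern: the required argument is omitted.
    return kernel.get_grid_size_upper_bounds_as_exprs()


def grid_sizes_with_context(kernel, callables_table=None):
    # Check the precondition up front and explain it to the caller.
    if callables_table is None:
        raise UnsupportedFeatureError(
            "grid sizes cannot be determined without a callables table; "
            "private temporaries may be unsupported on this code path")
    return kernel.get_grid_size_upper_bounds_as_exprs(callables_table)


knl = ToyKernel()

# The old call pattern fails with Python's generic missing-argument TypeError.
try:
    grid_sizes_old_call(knl)
except TypeError as exc:
    old_message = str(exc)

# The guarded helper fails with a message that names the precondition.
try:
    grid_sizes_with_context(knl)
except UnsupportedFeatureError as exc:
    new_message = str(exc)

# Supplying the argument (an empty placeholder here) makes the call succeed.
gsize, lsize = grid_sizes_with_context(knl, callables_table={})
```

In this sketch the first failure only says that an argument is missing, while the second tells the caller which input is needed. Which of the two behaviours is appropriate for the ISPC target is for the maintainers to decide.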

inducer commented 1 year ago

Thanks for the report! #764