Open xywei opened 3 years ago
Thanks for reporting this! #483 fixes the unassigned variable (which is definitely a bug). ed5d1458abb07f7d30de4854b1e4f427480e52df was an attempt to make type inference succeed, but given that insn1
is self-referential, I don't know that it should succeed... so I gave up on that approach.
@kaushikcfd, any thoughts on this?
We could hack so that the type-inference doesn't traverse instructions with circular dependencies like insn2
. Here's the patch for it, which infers the types correctly. (I guess it would be better if I open a PR for it)
diff --git a/loopy/type_inference.py b/loopy/type_inference.py
index 867f6893..2c496913 100644
--- a/loopy/type_inference.py
+++ b/loopy/type_inference.py
@@ -689,6 +689,7 @@ def _infer_var_type(kernel, var_name, type_inf_mapper, subst_expander):
import loopy as lp
type_inf_mapper = type_inf_mapper.copy()
+ had_raised_dep_failure = False
for writer_insn_id in kernel.writer_map().get(var_name, []):
writer_insn = kernel.id_to_insn[writer_insn_id]
@@ -698,32 +699,36 @@ def _infer_var_type(kernel, var_name, type_inf_mapper, subst_expander):
expr = subst_expander(writer_insn.expression)
debug(" via expr %s", expr)
- if isinstance(writer_insn, lp.Assignment):
- result = type_inf_mapper(expr, return_dtype_set=True)
- elif isinstance(writer_insn, lp.CallInstruction):
- return_dtype_sets = type_inf_mapper(expr, return_tuple=True,
- return_dtype_set=True)
-
- result = []
- for return_dtype_set in return_dtype_sets:
- result_i = None
- found = False
- for assignee, comp_dtype_set in zip(
- writer_insn.assignee_var_names(), return_dtype_set):
- if assignee == var_name:
- found = True
- result_i = comp_dtype_set
- break
-
- assert found
- if result_i is not None:
- result.append(result_i)
-
- debug(" result: %s", result)
+ try:
+ if isinstance(writer_insn, lp.Assignment):
+ result = type_inf_mapper(expr, return_dtype_set=True)
+ elif isinstance(writer_insn, lp.CallInstruction):
+ return_dtype_sets = type_inf_mapper(expr, return_tuple=True,
+ return_dtype_set=True)
- dtype_sets.append(result)
+ result = []
+ for return_dtype_set in return_dtype_sets:
+ result_i = None
+ found = False
+ for assignee, comp_dtype_set in zip(
+ writer_insn.assignee_var_names(), return_dtype_set):
+ if assignee == var_name:
+ found = True
+ result_i = comp_dtype_set
+ break
+
+ assert found
+ if result_i is not None:
+ result.append(result_i)
+ debug(" result: %s", result)
+
+ dtype_sets.append(result)
+ except DependencyTypeInferenceFailure:
+ had_raised_dep_failure = True
if not dtype_sets:
+ if had_raised_dep_failure:
+ raise DependencyTypeInferenceFailure
return (
None, type_inf_mapper.symbols_with_unknown_types, None,
type_inf_mapper.clbl_inf_ctx)
That seems similar to ed5d1458abb07f7d30de4854b1e4f427480e52df in its approach, however I'm not sure either approach necessarily yields a consistent treatment of self-referential assignments. Just skipping them basically ignores type constraints that arise from those. I'm not sure that's what we want. Upon further thought, ed5d1458abb07f7d30de4854b1e4f427480e52df might actually be OK: It considers non-self-referential assignments and tests them for consistency using all other assignments (in a second round), including the self-referential ones. Maybe that's what we should do. I'm not sure I had fully grasped the importance of the "second round" performed by the code when I dismissed the approach.
Hadn't looked at https://github.com/inducer/loopy/commit/ed5d1458abb07f7d30de4854b1e4f427480e52df, that LGTM. On trying it with small kernels with reduction-like instruction pairs, it yielded the correct results (due to the second round).
Type inference on the following kernel fails with message
UnboundLocalError: local variable 'symbols_with_unavailable_types' referenced before assignment
:The kernel compiles if the conditional in
insn2
is not an expression ofx[i]
, e.g.,This issue was revealed in
volumential
's table building https://gitlab.tiker.net/xywei/volumential/-/jobs/305193.