Open sv2518 opened 2 years ago
Thanks for the report! That's a concerning bug. Arguments must not be dropped in entrypoint kernels. I think this should be an easy fix. Here's a smaller reproducer:
import loopy as lp

knl = lp.make_kernel(
    "{ : }",
    """
    a[0] = 1
    """,
    [lp.GlobalArg("a,b,c,d,e",
                  shape=(10,),
                  dtype="float64")])

# Print the generated device code to inspect the kernel signature.
print(lp.generate_code_v2(knl).device_code())
which generates the kernel:
__kernel void __attribute__ ((reqd_work_group_size(1, 1, 1))) loopy_kernel(__global double *__restrict__ a)
{
a[0] = 1.0;
}
Notice how the arguments b, c, d, and e were removed.
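One way to catch this kind of regression is to compare the declared argument names against the parameters that actually appear in the generated signature. The following is only a rough sketch using the standard library: `kernel_arg_names` and `missing_args` are hypothetical helpers, not part of loopy, and the regex assumes simple pointer or scalar parameters like those in the kernel above.

```python
import re


def kernel_arg_names(c_source, kernel_name):
    """Return the parameter names in the signature of kernel_name.

    Hypothetical helper: assumes each parameter declaration ends with
    its name (pointer or scalar parameters, no array brackets).
    """
    # Grab the text between the parentheses that follow the kernel name.
    match = re.search(re.escape(kernel_name) + r"\s*\(([^)]*)\)", c_source)
    if match is None:
        raise ValueError(f"no signature found for {kernel_name!r}")
    params = [p.strip() for p in match.group(1).split(",") if p.strip()]
    # The parameter name is the trailing identifier of each declaration.
    return [re.search(r"([A-Za-z_]\w*)\s*$", p).group(1) for p in params]


def missing_args(expected, c_source, kernel_name):
    """Return the expected argument names absent from the signature."""
    present = set(kernel_arg_names(c_source, kernel_name))
    return [name for name in expected if name not in present]


# The device code quoted above, pasted in as a string.
generated = """
__kernel void __attribute__ ((reqd_work_group_size(1, 1, 1))) loopy_kernel(__global double *__restrict__ a)
{
  a[0] = 1.0;
}
"""

print(missing_args(["a", "b", "c", "d", "e"], generated, "loopy_kernel"))
# prints ['b', 'c', 'd', 'e']
```

In a real test the `generated` string would presumably come from `lp.generate_code_v2(knl).device_code()` rather than being pasted in by hand.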
Hi. I noticed an issue with the argument passing that has been introduced in https://github.com/inducer/loopy/pull/631 (I believe).
When we inline an inner kernel into a wrapper kernel, and the inner kernel takes many arguments but its code depends on only a subset of them, the inlined kernel in the generated C code takes only that subset as parameters. A minimal example can be found here: https://gist.github.com/sv2518/d6690f63b6827750fdaed9b01b4d9cea. I noticed that this only happens for CTargets, not for whatever Loo.py's default target is.
This is problematic in Firedrake for the vectorisation, where we generate a loopy kernel first, apply the inlining and vectorisation transformations, run C codegen after that, and then pass the arguments dynamically. This means the arguments of the kernel before the transformations must be the same as after them. The MFE I linked above comes from our test suite; the corresponding test is tests/regression/test_par_loops.py::test_dict_order_parallel.
Also, sorry for the noise recently; it's mostly because I am still trying to get the Firedrake vectorisation PR landed.