inducer / loopy

A code generator for array-based code on CPUs and GPUs
http://mathema.tician.de/software/loopy
MIT License

Add special case to work around length-1 loop index removal in meshmode array contraction #813

Closed: majosm closed this 8 months ago

majosm commented 9 months ago

It's not pretty, but it seems to do the job. @inducer Is this kind of what you had in mind? If yes, I'll hand it off to Mike to do some more stress testing on lassen before we merge (I've only tested at small scales on my laptop so far).

majosm commented 9 months ago

Looks like I need an approval for CI.

inducer commented 8 months ago

Related: https://github.com/inducer/loopy/issues/809

inducer commented 8 months ago

I've finally had a chance to wrap my head around this and (hopefully) clarify the code. I've made the decision to not enable it by default, because (to my mind) the behavior isn't desirable other than as a workaround for mirgecom. I believe it is general and correct as written though.

inducer commented 8 months ago

@majosm, could you confirm that this still does the trick on the mirgecom end? (You'll need to pass _enable_mirgecom_workaround=True from the transform side.)

majosm commented 8 months ago

> @majosm, could you confirm that this still does the trick on the mirgecom end? (You'll need to pass _enable_mirgecom_workaround=True from the transform side.)

Yep, looks like it works. 👍

inducer commented 8 months ago

Great, thanks! In it goes.