Open steven-johnson opened 4 years ago
As usual, nearly all the time is in LLVM, so we must be generating pathological .ll.
So I'm not sure how to fix this without just not using this schedule - it scalarizes like crazy. I guess I could try to see how to avoid scalarization. It seems to be in the tail case though.
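For readers unfamiliar with the term, the "tail case" is the set of leftover iterations when a loop extent is not a multiple of the vector width. The sketch below is only an illustration in plain C++, not code from apps/nn_ops or from Halide. The function name and the width `kLanes` are hypothetical. It shows a loop split into full-width chunks plus a one-element-at-a-time remainder, which is the part that ends up scalar.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical vector width, chosen only for illustration.
constexpr std::size_t kLanes = 8;

// Adds 1 to every element of `data`.
// The main loop walks the data in chunks of kLanes elements (the shape a
// compiler can turn into vector instructions). Whatever is left over is
// handled one element at a time: that remainder is the "tail case".
// Returns the number of elements processed by the scalar tail loop.
std::size_t add_one_with_scalar_tail(std::vector<int>& data) {
    const std::size_t n = data.size();
    const std::size_t main_end = n - (n % kLanes);

    // Full-width chunks.
    for (std::size_t base = 0; base < main_end; base += kLanes) {
        for (std::size_t lane = 0; lane < kLanes; ++lane) {
            data[base + lane] += 1;
        }
    }

    // Scalar tail: fewer than kLanes elements remain.
    std::size_t tail_count = 0;
    for (std::size_t i = main_end; i < n; ++i) {
        data[i] += 1;
        ++tail_count;
    }
    return tail_count;
}
```

With an extent of 19 and a width of 8, two full chunks cover 16 elements and the remaining 3 go through the scalar loop. This says nothing about why the real generator's tail scalarizes so heavily; it only pins down what the term means.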
Let me look at the revision history and see if that tells me anything.
Adding @vksnk as he was involved in both the google-specific code and the version going into apps/nn_ops -- maybe he can shed light on whether the simpler version in apps/nn_ops came first (and the more complex version in google was a later rev), or vice versa, and why there is a divergence.
Yeah, this generator was always the slowest one to compile. I think the one in google is more recent and, if I remember correctly, has a slightly better schedule for some of the corner cases. That being said, I don't think this particular implementation is actively used right now, so it should probably be okay to comment part of the schedule out to unblock the change. I can take care of it.
Although we probably still want to report this to QC so they can take a look (especially since there is a public repro case).
Was this fixed by #4744?
Steps to repeat. We are going to build and run one of the Generators in apps/nn_ops:
At current master branch, on my stupid fast Linux box, I get:
To see the pathological schedule, go to apps/nn_ops/DepthwiseConvolution_generator.cpp and change the lines to
Now, re-run the steps above (still on master). I get