Open shlok opened 1 month ago
Did you use fusion-plugin? Almost all fusion issues are taken care of by the fusion-plugin and by inlining all the functions that are part of the fused loop. The compilation guide recommends using the plugin for fusion.
fusion-plugin has several verbosity levels and it reports all the cases where fusion breaks. It also has a feature to print outputs from each optimization pass in separate files to compare the outputs pass by pass, but this has been broken in recent compilers.
Some guidelines are provided in the optimization guide.
Let me know if your problem is not covered by these, if the docs need improvement, or if the docs could be made easier to discover.
@harendra-kumar Thanks; I just tried. It doesn't look like the plugin makes a difference (unless of course I'm somehow just using it wrong).
I used GHC 9.4.8 (to make sure fusion-plugin's core dump feature works) and used these compiler options as instructed:
-Wall
-O2
-fdicts-strict
-fmax-worker-args=16
-fspec-constr-recursive=16
-fplugin=Fusion.Plugin
-fplugin-opt=Fusion.Plugin:dump-core
This is how I build my original example (with the `& S.take 10` line that caused fusion to break): `cabal clean && cabal build > ./core-dumps.txt`.
I still see lots of `Step`s in the final main function.
Next I add `-funfolding-use-threshold=1000` to the compiler options, and build like this: `cabal clean && cabal build > ./core-dumps-with-extra-compiler-option.txt`
Now the `Step`s in the final main function are gone.
So `-funfolding-use-threshold=1000` seems to be needed, with or without the plugin.
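The comparison above comes down to checking the dumped Core for `Step`. As a rough illustration, that check could be automated with a few lines of Haskell. This is only a sketch: `countMarkerLines` is a hypothetical helper, it matches the substring anywhere in a line (so unrelated identifiers containing `Step` are counted too), and it does not restrict the search to `main`.

```haskell
import Data.List (isInfixOf)
import System.Environment (getArgs)

-- | Count the lines of a Core dump that mention the given marker.
-- Hypothetical helper: a plain substring match, nothing Core-aware.
countMarkerLines :: String -> String -> Int
countMarkerLines marker = length . filter (marker `isInfixOf`) . lines

main :: IO ()
main = do
    -- Each command-line argument is taken to be the path of a dump file.
    paths <- getArgs
    mapM_ report paths
  where
    report path = do
        contents <- readFile path
        putStrLn
            (path ++ ": " ++ show (countMarkerLines "Step" contents)
                ++ " line(s) mention Step")
```

Saved under an arbitrary name such as `CountSteps.hs`, it could be run as `runghc CountSteps.hs ./core-dumps.txt ./core-dumps-with-extra-compiler-option.txt` to compare the two builds side by side.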
I have attached the two files (only the final step to make the files smaller) so you can see the difference:
Thanks as always for a great library. I came across a thing I'd like to discuss.
Example (my setup: `streamly-0.10.1`; `streamly-core-0.2.2`; GHC 9.4.8):
- When compiling with `cabal clean && cabal build --ghc-options="-O2 -ddump-simpl -ddump-to-file"`, fusion breaks. (This is how I've been naively spotting fusion breakage: I search the core dump for `"Step"`; if there is no `"Step"` under `main`, the program fused successfully; if `"Step"` appears all over the place in `main`, fusion broke.)
- Without `& S.take 10`, the fusion works. (I.e., just a simple `take` can make all the difference.)
- With `& S.take 10`, and compiling instead with `cabal clean && cabal build --ghc-options="-O2 -ddump-simpl -ddump-to-file -funfolding-use-threshold=1000"`, the fusion works.

Issue:
Proposed solution (feel free to provide better ones):
- `take` can break fusion (depending on other things).
- `-funfolding-use-threshold` can help users fix the fusion.
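For readers wondering why leftover `Step` constructors in the Core are a sign of broken fusion, here is a minimal, self-contained sketch of a step-based stream. It is a simplified pure model, not streamly's actual definitions: streamly's streams are monadic and carry extra state, and the names `Stream`, `fromTo`, `takeS` and `sumS` are invented for this illustration. The idea is that each combinator wraps the previous step function. If GHC inlines the whole chain, the intermediate `Yield`/`Skip`/`Stop` values can be optimized away. If some part is not inlined, they remain in the Core as real allocations.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- Simplified model of a fusible stream (illustrative only).
data Step s a = Yield a s | Skip s | Stop

-- A stream is a step function plus an initial state of hidden type.
data Stream a = forall s. Stream (s -> Step s a) s

{-# INLINE fromTo #-}
fromTo :: Int -> Int -> Stream Int
fromTo lo hi = Stream step lo
  where
    step i
        | i > hi = Stop
        | otherwise = Yield i (i + 1)

-- take wraps the inner step function and adds a counter to the state,
-- so there is one more layer that has to be inlined for the loop to fuse.
{-# INLINE takeS #-}
takeS :: Int -> Stream a -> Stream a
takeS n (Stream step0 s0) = Stream step (s0, 0 :: Int)
  where
    step (s, i)
        | i >= n = Stop
        | otherwise =
            case step0 s of
                Yield x s' -> Yield x (s', i + 1)
                Skip s' -> Skip (s', i)
                Stop -> Stop

-- The consumer drives the step function in a strict loop.
{-# INLINE sumS #-}
sumS :: Stream Int -> Int
sumS (Stream step s0) = go 0 s0
  where
    go acc s = acc `seq` case step s of
        Yield x s' -> go (acc + x) s'
        Skip s' -> go acc s'
        Stop -> acc

main :: IO ()
main = print (sumS (takeS 10 (fromTo 1 1000))) -- prints 55
```

Whether GHC actually fuses this toy pipeline depends on the optimization flags, just as in the report above. The sketch only shows where the `Step` values come from, and why an extra layer such as `take` gives the inliner more work.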