dhall-lang / dhall-haskell

Maintainable configuration files
https://dhall-lang.org/
BSD 3-Clause "New" or "Revised" License
908 stars 211 forks source link

optimize Natural/fold in the strict case #2585

Closed winitzki closed 3 weeks ago

winitzki commented 1 month ago

The proposed optimization follows the discussion in https://github.com/dhall-lang/dhall-lang/issues/1213#issuecomment-1855878600

The beta-normalization for Natural/fold n t succ zero corresponds to evaluating succ(succ(succ(...(succ(zero))...))) where we apply succ exactly n times.

The current code uses a loop where succ is always evaluated n times. However, sometimes the loop can be stopped early because the values stop changing.

For example:

let f : Natural → Natural = λ(x : Natural) → if Natural/isZero x then 1 else x
in Natural/fold 10000000 Natural f 5

With this definition, we have f x = x when x is nonzero. The expression evaluates to 5, but the evaluation takes time linear in the first argument (10000000) because the function f will be applied that many times to the value 5 and beta-normalization will be performed each time.

The new code rewrites strictLoop via strictLoopShortcut, which stops the loop as soon as f x = x.

This optimization is only applied in the "strict" case (Normalization.hs introduces strict and lazy cases when evaluating Natural/fold.)

There should be no change of behavior after this optimization. This is because all Dhall functions are pure; if f x = x then the result of evaluating f (f (f x)) is exactly the same as just x. So, the only effect of the new code is that some Natural/fold expressions will be beta-normalized faster than before.

I'd like to add some tests for the new behavior but I don't know how. The new behavior returns exactly the same normal forms as before, but does it faster when the "shortcut" is detected (when f x = x is detected). I am not a Haskell programmer, so any suggestions will be welcome here.

Another issue is that I am not familiar with the Haskell build systems, and both cabal new-build dhall as well as stack build fail on my machine in this project, with dependency errors that I am unable to fix.

winitzki commented 1 month ago

@Gabriella439 Thank you for the suggestion.

I'm not sure why the build fails for MacOS. I also have trouble building dhall-haskell on my Mac. I opened another PR to see if the project builds with no changes. https://github.com/dhall-lang/dhall-haskell/pull/2586 And the macOS build fails there too (while all other OSes succeed).

Update - this is fixed now.

winitzki commented 1 month ago

@Gabriella439 My changes do not actually seem to work. I built a local executable (stack build dhall) and ran it (stack exec dhall) but I don't observe any speedup in the test code shown here.

let f : Natural → Natural = λ(x : Natural) → if Natural/isZero x then 1 else x
in Natural/fold 100000000 Natural f 0

takes about 5 seconds but should be instantaneous. Am I doing something wrong?

I changed the Dhall version to 1.42.2 in dhall.cabal, I tried using strict always instead of lazy, and I tried adding Debug.Trace, and I even tried removing the entire App (App (App (App NaturalFold (NaturalLit n0)) t) succ') zero -> do code block, but nothing helped. In fact, I don't seem to be able to change the behavior of dhall in any way through local code changes.

$ stack build dhall
...
Installing executable dhall in /Users/sergei.winitzki/Code/winitzki-dhall-haskell/.stack-work/install/x86_64-osx/5cb58a553262eb646270303a92d2143ba964e30fb089950b829eb0787bd00e99/9.2.8/bin
Registering library for dhall-1.42.2..
$ stack exec dhall repl
Welcome to the Dhall v1.42.2 REPL! Type :help for more information.
⊢ :let f : Natural → Natural = λ(x : Natural) → if Natural/isZero x then 1 else x

f : Natural → Natural

⊢ Natural/fold 100000000 Natural f 0

1

This still takes about 5 seconds.

How could this even work if I compiled dhall with no code for App (App (App (App NaturalFold (NaturalLit n0)) t) succ') zero -> do? I know the code was recompiled because I get messages about recompiling Normalize.hs and a bunch of other files.

Am I editing the wrong place in the source code? I see that some code involving NaturalFold is also present in Eval.hs. What is the difference between Eval.hs and Normalize.hs?

winitzki commented 1 month ago

I added some new code to Eval.hs and now the test code executes instantaneously. The shortcut seems to be working.

Why is Natural/fold implemented in two different places and in two different ways?

For Eval.hs I couldn't figure out how to write that code properly. To detect the shortcut, I need to compare two values of type Val a but I don't see how to do that in general. Just for testing, I implemented a specific comparison in case the values are of type VNaturalLit but this is of course not satisfactory.

winitzki commented 3 weeks ago

@Gabriella439 You merged the PR but I still think the work is not finished. See my comment in the code.