andriusstank opened this issue 3 years ago
Good catch!
> On Tue, 20 Jul 2021 at 12:29, Andrius Stankevičius @.***> wrote:
>
> Simple example with fibonacci numbers:
>
> ```haskell
> n :: Int
> n = 50
>
> fibN :: Num a => a -> a
> fibN x0 = fibs !! n
>   where fibs = 0 : x0 : zipWith (+) fibs (tail fibs)
>
> fibInt :: Integer
> fibInt = fibN 1
>
> fibAD :: (Integer, Integer)
> fibAD = rad1g 0 1 fibN 1
> ```
>
> `fibInt` evaluates instantly. `fibAD` takes a long time and throws a stack overflow exception.
I don't know what to make of this though. On one hand, this example is not what I'd call a typical application of AD. But you're right, there's certainly a fundamental limitation in the code somewhere. I hope someone with more expertise in debugging laziness issues can shed some light on this.
Yeah, there is some potential for laziness to build up in some places, e.g. the `Num` instance:

```haskell
(+) = op2Num $ \x y -> (x + y, id, id)
```

If you have a lot of additions you might end up with `fibInt = fibN 1 = 12586269025` (for `n = 50`) as a thunk `1 + 1 + 1 + ...`, which can easily blow up memory.

Just replacing it by

```haskell
(+) = op2Num $ \x y -> let sum = x + y in sum `seq` (sum, id, id)
```

doesn't fix it yet though. But since you're building up thunks of function compositions in the other elements of the tuple, it's hard to prevent a space leak there, I'd guess, since that can't be easily normalised.
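To make that concrete, here is a small standalone sketch (hypothetical types, not `ad-delcont`'s actual internals) of a dual value carrying a primal and a pullback function: forcing the sum with `seq` keeps the primal flat, but the composed pullbacks in the other component still pile up as unevaluated thunks.

```haskell
import Data.List (foldl')

-- Hypothetical stand-in for a reverse-mode dual value: a primal Double plus
-- a pullback, loosely mirroring the (x + y, id, id) tuples discussed above.
data D = D Double (Double -> Double)

-- Lazy addition: both the primal sum and the composed pullback stay thunks.
plusLazy :: D -> D -> D
plusLazy (D x kx) (D y ky) = D (x + y) (kx . ky)

-- The seq-on-the-sum variant: the primal is forced at each step, but the
-- pullback is still an ever-growing chain of function compositions.
plusSeq :: D -> D -> D
plusSeq (D x kx) (D y ky) = let s = x + y in s `seq` D s (kx . ky)

main :: IO ()
main = do
  let ones       = replicate 1000000 (D 1 id)
      D v pullbk = foldl' plusSeq (D 0 id) ones
  print v           -- primal forced incrementally, no big (+) thunk
  print (pullbk 1)  -- forcing a million nested compositions is still costly
```

Swapping `plusSeq` for `plusLazy` makes even `print v` force a million-deep chain of `(+)` thunks, which is the memory blow-up described above; the pullback chain is the part that `seq` alone cannot normalise.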
Thank you @turion. It really sounds like a radical redesign is in order. In its current version, `ad-delcont` is mostly a straightforward port of the implementation from the Wang article, which they were able to pull off by using Scala, which is eagerly evaluated (IIRC).
Somewhat simpler example:

```haskell
forkJoin :: Fractional a => a -> a
forkJoin x = (x + x) / 2

forkJoinChain :: Fractional a => a -> a
forkJoinChain x = iterate forkJoin x !! 20

forkJoinChainAD :: Float -> (Float, Float)
forkJoinChainAD x = rad1g 0 1 forkJoinChain x
```

I don't think laziness is a problem here. At worst it may cause memory consumption to be proportional to the amount of computation, which shouldn't be an issue with only tens of variables.
My guess would be that this piece of code causes a combinatorial explosion:

```haskell
op2_ zeroa plusa plusb f ioa iob = do
  ra <- ioa
  rb <- iob
```

`ioa` and `iob` are computations that are evaluated every time they are needed. Each of them has two parents, each of which in turn also has two parents, and so on.
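As a toy model of that hypothesis (plain `IO` and a counter, not the library's `ContT` machinery), here is a sketch where each "variable" is an action that gets re-run at every use site; 20 fork/join layers then re-run the shared input 2^20 times:

```haskell
import Data.IORef

-- One fork/join layer over an action-valued "variable": both uses of iox
-- re-run the underlying computation, so each layer doubles the work.
forkJoinIO :: IO Double -> IO Double
forkJoinIO iox = do
  a <- iox            -- first "parent" re-runs the computation
  b <- iox            -- second "parent" re-runs it again
  pure ((a + b) / 2)

main :: IO ()
main = do
  counter <- newIORef (0 :: Int)
  let x0    = modifyIORef' counter (+ 1) >> pure 1.0
      chain = iterate forkJoinIO x0 !! 20   -- 20 layers, as in forkJoinChain
  v <- chain
  n <- readIORef counter
  print v   -- 1.0
  print n   -- 1048576 = 2^20 runs of the shared input
```

If `op2_` really does re-run `ioa` and `iob` each time a node is consumed, the same doubling would happen per binary operation, which would match the slowdown seen in both examples.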
Maybe the `AD` variable should also store the value and an `STRef` for the gradient along with the `ContT` computation, and maybe some flag to avoid repeated evaluations. Like laziness, but manual.
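As a rough sketch of that "manual laziness" idea (a hypothetical helper, not part of `ad-delcont`'s API), one could cache the result of an `ST` computation behind an `STRef` so repeated consumers don't re-run it:

```haskell
import Control.Monad.ST (ST, runST)
import Data.STRef

-- Wrap an ST computation so it runs at most once; later calls return the
-- cached result instead of re-evaluating.
memoST :: ST s a -> ST s (ST s a)
memoST act = do
  cache <- newSTRef Nothing
  pure $ do
    cached <- readSTRef cache
    case cached of
      Just v  -> pure v              -- already evaluated: reuse it
      Nothing -> do
        v <- act                     -- first use: run and remember
        writeSTRef cache (Just v)
        pure v

-- Two consumers of the same shared computation; the body runs only once.
demo :: Int
demo = runST $ do
  shared <- memoST (pure (1 + 1))
  a <- shared
  b <- shared
  pure (a + b)   -- 4
```

Whether this fits with the library's `ContT`-based wrapper is another question, but it is the kind of once-only evaluation the suggestion above is pointing at.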
Anyway, I don't really grok the code, so it's all just wild guesses.
Have you tried the `Strict` extension (`-XStrict`) to see if it goes away?
@tonyday567 thanks, but I just tried and `-XStrict` doesn't help.