Open wasowski opened 2 years ago
Page 134 of the fpinscala book offers an indirect inspiration: avoid passing the list around as a pointer to its head, and instead embed the calculation you would pass it to in a `map` or another suitable lazy operator (a fold?). It seems that programming with `LazyList` works poorly if you have to evaluate more than one expression over the list. Ideally all list elements, including the head, should be collectable temporaries.
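A minimal sketch of the "more than one expression" problem, assuming the standard library `scala.collection.immutable.LazyList` (Scala 2.13+) rather than the book's own implementation. `SinglePass`, `meanTwoPasses` and `meanOnePass` are hypothetical names used only for illustration; whether the one-pass variant actually lets the prefix be collected also depends on the caller not keeping its own reference to the head.

```scala
object SinglePass {
  // Two expressions over the same list: `xs` is used again after the first
  // fold, so the memoized cells stay reachable until the second traversal.
  def meanTwoPasses(xs: LazyList[Int]): Double = {
    val sum = xs.foldLeft(0L)(_ + _)
    val count = xs.size
    sum.toDouble / count
  }

  // One fused fold: sum and count are accumulated together, so the list is
  // traversed once and no later expression needs the head again.
  def meanOnePass(xs: LazyList[Int]): Double = {
    val (sum, count) = xs.foldLeft((0L, 0L)) { case ((s, n), x) => (s + x, n + 1) }
    sum.toDouble / count
  }
}
```

Both variants compute the same value; the difference is only in how long the list has to stay reachable.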
PR #232 improves performance quite a bit, but we are still far from using no memory at all. We might, however, be getting close to what an imperative program (say, in Python) would use if it followed the same high-level pattern of learning and storing policies first, then evaluating later. Some directions to push it further:
`probula` and implement `Randomized3` using `spire` directly. This should reduce space overhead by a constant factor too.
The stack overflow problems are solved (issue #53), but we still have OOM problems at about 400K episodes with `simplemaze`. This most likely means that we hold on to some memory through the `Randomized` schedule, instead of consuming this memory on the fly when learning.
Interestingly, there are no memory leaks with `UnitAgent`. This could be because the Q matrix is never updated, or is updated very few times. It could also be because it is not `learn` but `learningEpisode` that leaks via `iterateUntilM` (and this does not show because all episodes terminate immediately for `UnitAgent`). It would be good to start by devising two tests that check these hypotheses.
Another hypothesis to be tested:
In principle, `f` consumes the list and does not need the elements in memory, but the head of `l` is referenced from a named value in the caller, which should prevent the garbage collector from freeing memory. For a sufficiently large `C` we should run out of heap space. I doubt that calling `f(from(1).take(C))` directly would make any difference, but it is worth checking.
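The two call shapes discussed above could be compared with a sketch like the following. It assumes the standard library `LazyList` (so `from(1)` becomes `LazyList.from(1)`), and the body of `f` is a hypothetical stand-in that simply sums the elements; `HeadRetention`, `named` and `direct` are illustrative names, not part of the project.

```scala
object HeadRetention {
  // Hypothetical stand-in for `f`: consumes the list once, element by element.
  def f(l: LazyList[Int]): Long = l.foldLeft(0L)(_ + _)

  // Variant 1: the head is bound to the named value `l` in the caller.
  def named(c: Int): Long = {
    val l = LazyList.from(1).take(c)
    f(l)
  }

  // Variant 2: the list is passed directly, with no named value in the caller.
  def direct(c: Int): Long = f(LazyList.from(1).take(c))
}
```

To turn this into an actual experiment, one could run each variant in a separate JVM with a small heap (for example `-Xmx64m`) and a large `c`, and observe whether either of them runs out of memory. The outcome may depend on the JIT and on the `LazyList` implementation used, so the sketch only fixes the two shapes being compared.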