Closed basdebakker closed 4 years ago
great catch!!
@NthPortal have time to take a look?
I think that is expected, because the test is overly strict.
When we evaluate whether a lazy list is empty or not, we evaluate its head, and we create a new instance of LazyList
for its tail. However, that LazyList
is completely unevaluated: we don't know whether it's empty or not, and we most definitely haven't computed its head.
The test, as written, insists on not wanting the instance of LazyList
to be created. But that's too strict. In practice what we want is that the state of the tail hasn't been computed yet.
ah, I believe you are right, thanks for thinking about this more carefully than I did
even after understanding what's going on, though, "I expect the behavior of LazyList to be at least as lazy as Stream" still nags at my intuition somewhat. but I guess it's my intuition that needs to fall in line
fwiw, this (the delay in the println
happening) also goes against what my intuition was:
scala 2.13.2> { println("hey!"); 0 } #:: LazyList()
val res9: scala.collection.immutable.LazyList[Int] = LazyList(<not computed>)
scala 2.13.2> res9.isEmpty
hey!
val res10: Boolean = false
but perhaps that's because I used Stream
for a decade before LazyList
came along, and so my intuitions developed in a certain direction
The problem is that I can't construct the LazyList
instance for the tail yet. My version of the stream1
method looks more like
if (condition(args)) head #:: recursiveCall(newArgs) else Stream.empty
and it is the evaluation of condition
that I also need to postpone, but is now evaluated when I replace Stream
by LazyList
.
Hum there should be a way to first create a LazyList
with a function that delays the moment where you have to do the if (condition(args))
. That's the one true benefit of the new design. That said just looking at the public API, I don't see a way to do that in user space. @SethTisue @NthPortal What would be the user-space to do that?
Otherwise, you might be able to achieve your end goal with LazyList.unfold
.
Hum there should be a way to first create a
LazyList
with a function that delays the moment where you have to do theif (condition(args))
Otherwise, you might be able to achieve your end goal with
LazyList.unfold
.
unfold
is the intended way to defer the evaluation of the condition; or more precisely, to defer the evaluation of whether or not it is empty. Alternatively, you can create an Iterator
that produces the elements and call from
with it as the arg, though this may be a bit more work. Obviously, #::
(and #:::
) cannot be used to defer the evaluation of whether or not it is empty, as it only produces non-empty LazyList
s.
Like @sjrd, I am unsure whether or not this is a bug. It is fixable though (I've already come up with a fix) if we decide it needs to be fixed.
The fix, for reference, is to change LazyList.Deferrer.#::
to this:
def #:: [B >: A](elem: => B): LazyList[B] = newLL(sCons(elem, newLL(l().state)))
sadly, no one thought of a when
method for the LazyList
companion, and we can't add one now :(
@basdebakker as Seb said, unfold
does pretty much what you want, without too much restructuring.
LazyList.unfold(args) { args =>
if (condition(args)) Some((head, newArgs)) else None
}
alternatively, if you want to keep your current structure, you can use the following workaround hack of inserting a LazyList.empty #:::
.
if (condition(args)) head #:: LazyList.empty #::: recursiveCall(newArgs) else LazyList.empty
// or
LazyList.empty #::: { if (condition(args)) head #:: recursiveCall(newArgs) else LazyList.empty }
Thanks. I rewrote my code (which is a bit more complex) using LazyList.unfold
and it works for me. I'll even say it is now better than it was before.
Still, my old code was a common pattern and from what I read I had expected to use LazyList
as a drop-in replacement for Stream
.
After some thought, I don't think it's a bug, for the following reason: on the first iteration, the condition is not evaluated lazily (which is something LazyList
supports), so the pattern isn't properly lazy in the first place; consequently, you shouldn't expect it to be properly lazy as a tail either.
Thoughts?
question for @basdebakker:
"from what I read I had expected to use LazyList as a drop-in replacement for Stream"
is there some documentation we ought to improve? https://github.com/scala/scala/releases/tag/v2.13.0 currently says "immutable.LazyList replaces immutable.Stream. Stream had different laziness behavior and is now deprecated", with links to PRs
two questions for @NthPortal:
sadly, no one thought of a
when
method for theLazyList
companion, and we can't add one now"
what would the signature of that be...?
The fix, for reference, is to change
LazyList.Deferrer.#::
to this:def #:: [B >: A](elem: => B): LazyList[B] = newLL(sCons(elem, newLL(l().state)))
I see that the current definition is:
def #:: [B >: A](elem: => B): LazyList[B] = newLL(sCons(elem, l()))
what would the downside be of applying this change?
sadly, no one thought of a
when
method for theLazyList
companion, and we can't add one now"what would the signature of that be...?
I'll do you one better, and give you an implementation with it:
def when[A](condition: => Boolean)(deferred: => LazyList[A]): LazyList[A] =
newLL {
if (condition) deferred.state else State.Empty
}
Could mirror Option
and add an unless
too.
Also potentially a useful method for this sort of thing:
/** Defers the computation of a block that returns a `LazyList` so that
* logic outside the `LazyList`'s creation is also lazy.
*/
def deferred[A](thunk: => LazyList[A]): LazyList[A] = newLL(thunk.state)
what would the downside be of applying this change?
The major downside is that it increases the computation unnecessarily for cases where the tail => LazyList
is fully lazy/doesn't contain its own logic. It essentially wraps each tail in an extra layer of laziness that it didn't need, since it was already lazy.
Take for example the classic Fibonacci implementation:
val fibs: LazyList[Int] = 1 #:: 1 #:: fibs.zip(fibs.tail).map { n => n._1 + n._2 }
fibs.zip
(i.e. LazyList#zip
) is fully and properly lazy, so wrapping its state in a separate LazyList
adds nothing. This is done for every single tail as well, which is a significant amount of unnecessary computation for LazyList
s whose elements are already simple to compute (such as Fibonacci numbers).
side note: I really hope the compiler inlines newLL
and sCons
, because they make the code cleaner but don't actually do anything (update: it does)
The test, as written, insists on not wanting the instance of LazyList to be created. But that's too strict. In practice what we want is that the state of the tail hasn't been computed yet.
Where would a regular user learn about what the difference is between a LazyList being created and "its state" being computed? And why it matters and how it's different from what Stream did?
question for @basdebakker:
"from what I read I had expected to use LazyList as a drop-in replacement for Stream"
is there some documentation we ought to improve?
My impression came about from reading the migration guide which states
This allows creating a LazyList without evaluating the head element. immutable.Stream, which has a strict head and a lazy tail, is deprecated.
Reading it again I admit it doesn't explicitly say so, but to me it seemed to mean that LazyList
was "more lazy" than Stream
. So when it turned out that my code became less lazy (as discussed above) I was surprised.
It also states
Stream is replaced with LazyList
without any qualification. If this difference in behavior is considered correct, I think the migration guide could use (a reference to) some more detailed explanation.
The short answer for the differences between Stream
and LazyList
is the following:
s: Stream
, you already know whether it's empty or not, and if it is non-empty, you know the head
. The tail
is computed lazily.l: LazyList
, you don't even know whether it's empty or not (this is the more lazy than Stream
aspect). Once you know whether it is non-empty, then you also know the head
and the tail
(but the tail
itself doesn't necessarily know whether it's empty or not, so this has at least the same power as the Stream
's tail
being lazy).So the LazyList
is definitely more lazy than Stream
for one aspect: knowing whether it's empty and what's it's head is lazy; it's not on Stream
. And although the tail
is not lazy per se, the instance behind the tail
might itself be completely unknown yet, so in terms of the possible states, this is not less lazy than Stream
either.
Of course the code that builds these things might need to be adapted to reap all those possible states.
@sjrd would you mind if I used that as a base for a PR against LazyList's scaladoc (and, for the future lawyers reading along, could you let this comment fall under the contributions you have a CLA for?)
Sure. You can consider all my comments in this thread as Contributions.
Without a lazy constructor, it's actually really hard to give good advice on how to create a lazy tail. head #:: continually(thunk).take(1)
seems really silly.
I don't claim to have fully understood all this, but (like Som) my urge is still to change just #::
while leaving everything else alone.
w/r/t changing #::
, @NthPortal wrote:
This is done for every single tail as well, which is a significant amount of unnecessary computation
I'm not sure this is so bad, given that only code that explicitly uses #::
would pay the price — I assume we aren't using it internally, or if that we are, those internal call sites could be adjusted as needed?
I assume we aren't using it internally
correct - we only use newLL
and sCons
internally
my urge is still to change just
#::
while leaving everything else alone.
I'm starting to feel that too. the core use case for #::
I believe is actually recursive methods, not recursively defined values. and as noted by @basdebakker, most of the time that includes some logic other than the creation of the LazyList
.
I'll try to write a quick benchmark of before and after the extra wrapping
benchmark results for BigInt
fibonacci LazyList
to List
:
Benchmark (takeCount) Mode Cnt Score Error Units
LazyListConsBenchmark.lazier_#:: 10 avgt 20 1.238 ± 0.008 us/op
LazyListConsBenchmark.lazier_#:: 100 avgt 20 12.379 ± 0.107 us/op
LazyListConsBenchmark.lazier_#:: 1000 avgt 20 140.522 ± 0.920 us/op
LazyListConsBenchmark.regular_#:: 10 avgt 20 1.137 ± 0.018 us/op
LazyListConsBenchmark.regular_#:: 100 avgt 20 12.683 ± 0.183 us/op
LazyListConsBenchmark.regular_#:: 1000 avgt 20 132.274 ± 5.249 us/op
it's not clear that the loss from extra laziness is more than noise in this benchmark, and if it is, it doesn't seem super significant to me. I'll throw up a PR
@sjrd would you mind if I used that as a base for a PR against LazyList's scaladoc
@martijnhoekstra that'd be 💯
there should be a way to first create a LazyList with a function that delays the moment where you have to do the if (condition(args))
addressing this might now be possible over at https://github.com/scala/scala-library-next
I am trying to replace usages of the deprecated Stream with the recommended LazyList. But my code is now failing because it turns out that calling
isEmpty
on a LazyList actually forces the tail.reproduction steps
Compile and run this code with Scala 2.13.2:
problem
The output is the following, which shows that the tail argument of the LazyList is called, which is not the case for Stream.
expectation
I expect the behavior of LazyList to be at least as lazy as Stream.