Closed by lyin-vir 2 years ago
Thanks @lyin-vir for the question.
If I pass the output directly to another task, it works fine.
Yep, this is a common way to approach the issue.
I'm slightly confused since the task `double` is doing essentially the same thing as the code in `main`, which is also a redun task itself. My guess is that it has something to do with the lazy evaluation.
Your understanding is correct. `data` in `main()` is lazy (type `TaskExpression`), and we don't know how long it's going to be until it has been evaluated. Therefore, an eager statement like the for-loop in the list comprehension will fail.
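To make the laziness concrete, here is a rough toy model in plain Python. It is not redun's actual implementation, and the names `LazyExpression`, `toy_task`, `evaluate`, `double_all`, and `eager_main` are all hypothetical, made up for this sketch. It only illustrates why iterating a not-yet-evaluated result fails inside one task, while the same loop works inside a task that receives the result as an argument.

```python
class LazyExpression:
    """Toy stand-in for a lazy task result (hypothetical, not redun's class)."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args

    def __iter__(self):
        # Nothing has been computed yet, so the length is unknown.
        raise TypeError("Expressions of unknown length cannot be iterated")


def toy_task(func):
    """Toy decorator: calling the task returns a LazyExpression instead of running it."""
    def wrapper(*args):
        return LazyExpression(func, *args)
    return wrapper


def evaluate(value):
    """Toy scheduler: recursively resolve lazy expressions into concrete values."""
    if isinstance(value, LazyExpression):
        # Arguments are resolved first, then the task body runs on concrete values.
        args = [evaluate(arg) for arg in value.args]
        return evaluate(value.func(*args))
    if isinstance(value, list):
        return [evaluate(item) for item in value]
    return value


@toy_task
def get_list():
    return [1, 2, 3, 4, 5]


@toy_task
def double(x):
    return 2 * x


@toy_task
def double_all(data):
    # By the time this body runs, data is a plain list, so iterating is fine.
    return [double(x) for x in data]


def eager_main():
    data = get_list()                  # lazy: no list exists yet
    return [double(x) for x in data]   # raises TypeError


print(evaluate(double_all(get_list())))  # [2, 4, 6, 8, 10]
```

Calling `eager_main()` in this sketch raises the same kind of `TypeError` as in the question, because the loop runs before anything has been evaluated.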
My main concern is that if the code gets more complicated, I either have to encapsulate lots of logic into one giant task, or break the code down into tiny tasks whenever I have to iterate on a previous result. If this is intended, could you point me in the right direction?
We find it is common to break tasks down into small units, so that is a perfectly reasonable way to design a workflow.
One more tip, in case it helps: if you just want to map over the lazy list, you can do that using `redun.functools.map_`. Here is what it would look like in your example:
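from redun import task
from redun.functools import map_

redun_namespace = "redun.example.test"


@task()
def get_list():
    return [1, 2, 3, 4, 5]


@task()
def double(x):
    return 2 * x


@task()
def main():
    # data is a lazy expression here, so it cannot be iterated directly.
    data = get_list()
    # map_ applies the double task to each element once the list is known.
    return map_(double, data)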
This has the added benefit of running the `double` calls in parallel.
I hope this helps.
@mattrasmus Thank you very much for the clarification.
Hi there,
Thanks for creating this framework; I'm just starting to learn it. Here's an issue I often get stuck on: when I try to consume the output of a task (e.g., iterate through a list/dict output), I always hit this error:
[TypeError: Expressions of unknown length cannot be iterated]
If I pass the output directly to another task, it works fine. I'm slightly confused since the task `double` is doing essentially the same thing as the code in `main`, which is also a redun task itself. My guess is that it has something to do with the lazy evaluation. My main concern is that if the code gets more complicated, I either have to encapsulate lots of logic into one giant task, or break the code down into tiny tasks whenever I have to iterate on a previous result. If this is intended, could you point me in the right direction?
Here's a minimal repro:
Thank you!