Closed martinklepsch closed 5 years ago
FWIW this seems to fix the issue, not sure why yet though (note the double pr-str
)
diff --git a/boot/pod/src/boot/pod.clj b/boot/pod/src/boot/pod.clj
index b4f58f6..6903737 100644
--- a/boot/pod/src/boot/pod.clj
+++ b/boot/pod/src/boot/pod.clj
@@ -465,7 +473,9 @@
([expr]
(let [{:keys [meta? expr]} (read-string expr)]
(binding [*print-meta* meta?]
- (pr-str (eval expr)))))
+ (let [r (eval expr)]
+ (pr-str r)
+ (pr-str r)))))
([pod expr]
(let [arg (pr-str {:meta? *print-meta* :expr expr})
ret (with-invoke-in pod (boot.pod/eval-in* arg))]
Laziness seems to be at fault here. Using mapv
in the example below solves the issue but this doesn't really help when lazy seqs are nested deep inside maps etc.
(->> (range 30)
(map #(do (println %) %))
(pr-str))
(->> (range 30)
(map #(do (println %) %))
(clojure.walk/prewalk identity)
(pr-str))
This works — does anyone see issues with this approach?
@martinklepsch weird case but makes sense to me. Lazy things need to be fully realized 👍
The example / test case is a bit contrived but I was working with Codox and it prints whenever it excludes a var from the documentation. Generation of the data that drives the documentation happens lazily. The boot-codox
task is not affected by this because it never returns any data but just writes files.
Ok now that I understand the problem better I think I have one other idea.
I agree lazy things should be realized for side effects to happen but do we want to modify existing behavior?
We could instead expose a new API like do-all-in
or something for the pod and this includes the change in behavior of the PR.
I agree that this is strictly speaking a breaking change but pr-str
also realizes the entire data. So you could say we only move that realization step out of pr-str. Can you think of cases where the two variants would give different results?
Do we think people depend on intermixing of strings printed during the realization of the seq? FWIW I'm hesistant to add another function to that ns because:
So if pr-str
already realizes everything there wouldn't be any problem right?
I am wondering what is the missing case, what pr-str
does not realize and in case we identify the why maybe there is a better approach. I am just afraid we will break things that we don't expect to break...but well..maybe I am too on the defending/scared side :smile:
So if pr-str already realizes everything there wouldn't be any problem right?
pr-str
realizes everything by recursively walking the data printing item by item. If the generation of any next
items in a lazy seq cause println
s those will be intermixed with the printed stuff of pr
.
If pr-str
would not realize the the entire data it would not work very well 😛
Ok it seems like I cannot understand something about the issue still, or maybe I don't see it as boot issue per se:
(pr-str (eval '(map #(do (println (str "test" %) %)) (range 0 6))))
And it prints out the testX
within the returned vector. Is this the problem? You want to receive all the println
first and then the list of results?
The alternative (or mapv
as you said):
(pr-str (eval '(doall (map #(do (println (str "test" %)) %) (range 0 6)))))
Achieves that. Maybe the debate in my own head is something else: if we did that change users won't be allowed to control the behavior of lazy seqs anymore. You could not do my case number one basically.
Is case 1 common? For sure it is the existing behavior. Do we want to enforce 2 behavior? I don't actually know :smile: I would be cautious and introduce another API, however I will yield to the majority if the change seems innocuous enough.
And it prints out the testX within the returned vector. Is this the problem?
Yes
You want to receive all the println first and then the list of results?
I personally don't care about execution order, I do care about getting the unmodified return value of my functions though 🙂
If I understand you correctly: Users that rely on case 1 rely on the fact that pr-str
will mix values with printed stuff. I don't see any reason why you would realistically want that but you are absolutely right, if people rely on this it is a breaking change.
Saw the comment on slack @martinklepsch . My two cents: Is this performance critical? A super easy hack is to just pr-str
twice and take the second one. Edit: Ooops, I now see that this was already considered. Excuse the noise :)
@rauhs That's the first "solution" I found but I feel like clojure.walk
is fundamentally doing the same but without actually turning things into strings which is probably faster? Perf is desirable if possible.
This definitely seems like what doall
was created to avoid.
@minikomi IIRC the issue with doall
is that it doesn't recursively walk structures. That means calling doall
on a map won't fully realize lazy sequences which are part of the map vals.
@martinklepsh do you have a patch for this?
@flyboarder not ready but I might be able to provide one later.
The gist of it is that we need to call (clojure.walk/prewalk identity (eval expr))
in the snippet mentioned in this comment: https://github.com/boot-clj/boot/issues/683#issuecomment-357938537
By fully realising all nested lazy sequences we ensure that no side effecting (println
) functions mapped over those lists affect the output of the pr-str
that is called later.
To reproduce: copy the above into a file
repro
, make it executablechmod +x repro
, run it./repro
.I reproduced this issue with
2.6.0
,2.7.2
and2.8.0-SNAPSHOT
.EDIT updated with more minimal repro. The key thing seems to be that something is printed during iteration over a sequence.