Open xsebek opened 1 month ago
The same thing happens with the sliding puzzle solution:
swarm --scenario data/scenarios/Challenges/Sliding\ Puzzles/3x3.yaml --autoplay
For comparison, here is the number of lines in each example:
for f in example/*; do wc -l $f; done | sort -n
5 example/omega.sw
9 example/maybe.sw
12 example/wander.sw
13 example/dfs.sw
13 example/fact.sw
20 example/multi-key-handler.sw
24 example/cat.sw
26 example/pilotmode.sw
66 example/BFS-clear.sw
107 example/rectypes.sw
305 example/list.sw
And here is the length of the output JSON, measured in lines of the indented, human-readable JSON:
curl -s localhost:5357/robot/0 | jq '.program' | wc -l
346 swarm --scenario blank --run example/omega.sw
12 swarm --scenario blank --run example/maybe.sw
146 swarm --scenario blank --run example/wander.sw
1699 swarm --scenario blank --run example/dfs.sw
1223 swarm --scenario blank --run example/fact.sw
4334 swarm --scenario blank --run example/multi-key-handler.sw
5838 swarm --scenario blank --run example/cat.sw
12 swarm --scenario blank --run example/pilotmode.sw
125887 swarm --scenario blank --run example/BFS-clear.sw
1062053 swarm --scenario blank --run example/rectypes.sw
??? swarm --scenario blank --run example/list.sw
Maybe we can analyze rectypes.sw to see what is going on:
swarm --scenario blank --run example/rectypes.sw
curl -s localhost:5357/robot/0 | yq -P '.program' | sed 's/^ *//;s/ *$//' | sort | uniq -c | sort -n
These 3000 repetitions of source positions stand out to me:
3026 - 571
3026 - 575
3026 - 582
Looking in the file, the character positions 571, 575 and 582 would correspond to:
def cons : a -> List a -> List a = \x. \l. inr (x, l) end
^ ^ ^
So cons gets repeated a lot. @byorgey do you have any idea how that would happen?
Yes, I'm pretty sure I know exactly why this is happening, see https://github.com/swarm-game/swarm/issues/1907#issuecomment-2153036505 . I don't think there's anything special about cons in particular; you just happened to notice that one.
Many continuation stack frames in a robot's CESK machine contain an Env, and they have a lot of shared entries (e.g. in a context with 20 definitions, if you process one more def you now have the same 20 definitions still in scope plus one more). This is not usually a problem in memory, because the shared entries are literally shared: all the Env values just contain pointers. But serializing loses all the sharing.
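To make the size effect concrete, here is a minimal Python sketch. It is not Swarm's Haskell code: the `extend` helper and the dict-based environments are hypothetical stand-ins for Env values held by continuation frames. It builds 20 nested environments that share their parents, then compares the number of distinct nodes in memory with the number of nodes a naive serializer writes out.

```python
import json


def extend(env, name, value):
    """Return a new environment that points at `env` instead of copying it."""
    return {"name": name, "value": value, "parent": env}


# Hypothetical stand-in for continuation frames: frame i holds the
# environment that was in scope after i + 1 definitions.
env = None
frames = []
for i in range(20):
    env = extend(env, f"def{i}", i)
    frames.append(env)

# In memory, parents are shared, so only 20 distinct nodes exist.
unique_nodes = set()
for frame in frames:
    node = frame
    while node is not None:
        unique_nodes.add(id(node))
        node = node["parent"]

# A naive serializer follows every pointer and writes each parent chain
# in full, once per frame: 1 + 2 + ... + 20 = 210 nodes.
serialized = json.dumps(frames)
serialized_nodes = serialized.count('"name"')

print(len(unique_nodes), serialized_nodes)  # prints: 20 210
```

In this toy model the output grows quadratically with the number of definitions, even though memory use grows only linearly.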
To solve this we will have to either (1) recover the sharing when serializing somehow, or (2) store things in a way that makes the sharing more explicit in the first place.
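As a rough illustration of option (1), the following Python sketch writes every distinct environment node once into a table and lets frames refer to nodes by index. All names here (`extend`, `serialize_shared`, the `nodes` and `frames` keys) are assumptions made up for this example, and it detects sharing through Python object identity. How Swarm would detect shared entries in Haskell is left open.

```python
import json


def extend(env, name, value):
    """Return a new environment that points at `env` instead of copying it."""
    return {"name": name, "value": value, "parent": env}


def serialize_shared(frames):
    """Serialize linked environments as a node table plus integer references."""
    table = []  # every distinct node is written exactly once
    index = {}  # id(node) -> position of that node in `table`

    def intern(node):
        if node is None:
            return None
        key = id(node)
        if key not in index:
            # Intern the parent first so it already has a table position.
            parent_ref = intern(node["parent"])
            index[key] = len(table)
            table.append(
                {"name": node["name"], "value": node["value"], "parent": parent_ref}
            )
        return index[key]

    roots = [intern(frame) for frame in frames]
    return json.dumps({"nodes": table, "frames": roots})


# Same hypothetical setup as before: 20 frames with shared parents.
env = None
frames = []
for i in range(20):
    env = extend(env, f"def{i}", i)
    frames.append(env)

payload = json.loads(serialize_shared(frames))
print(len(payload["nodes"]))  # prints: 20
```

With this encoding the output stays linear in the number of definitions. The cost moves to the lookup that decides whether a node has been seen before, which is where the efficiency concern would show up.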
@byorgey in the linked issue, I meanwhile arrived at idea (1). 😄
I would very much like (2) but it sounds like a big rewrite. Though it might be necessary if we want to serialize/deserialize robots and (1) turns out to not work. 🤔
I mean, this is also going to be a critical component to #50 , so (2) could be worth it. I think (1) will work, I am just worried about making it efficient.
Describe the bug
Trying to get the base log after run "example/list.sw" does not terminate, at least not in any reasonable time. That's because the output is flooded with an infinite (or exponential) stream of JSON Syntax:
To Reproduce
Run swarm:
Get the output using cURL. I suggest piping to wc for 60 seconds. Optionally, observe the process memory usage with ps, top, or an alternative like btop.
Converting to gigabytes:
robot/0 outputted 0.874 GB of JSON
Expected behavior
I wanted to get the base log, i.e. this command should work:
Unfortunately, there are gigabytes of JSON of swarm syntax before the log.
Ideally, the syntax ToJSON should be usable, but I could use workarounds like robot/0/log to get the log directly, or, if the log were placed before the syntax, I could extract it and stop the cURL output.
Screenshots
Additional context
Given that the list example was only parsed and the functions were not run, this issue likely affects other solutions as well.
The other example files did not cause this magnitude of syntax output, but it's possible some other solutions would.