Closed fryguybob closed 12 years ago
(Imported. Original comment by felipe.l...@gmail.com on November 2, 2011, 06:33:47 PM UTC)
Perhaps the problem is with (|||), which behaves has these worst cases. But fixing hcat may be easier, and maybe even profitable on a better world where (|||) too has better performance.
(Imported. Original comment by byor...@gmail.com on November 3, 2011, 02:35:31 AM UTC)
Thanks for the report! And nice dendrogram package. =)
I've changed the implementation of cat' to use a balanced fold so now cat', cat, hcat, and vcat are all much faster (running bug.hs with 'hcat' is down to 0.5s / 7MB on my machine, the same time+memory as 'foldTree'). I also improved the implementation of 'beside' (and hence of (===) and (|||)), which brought the running time of 'foldr1' on my machine down from about 30s to 15s.
Wed Nov 2 22:12:46 EDT 2011 Brent Yorgey byorgey@cis.upenn.edu
D.Combinators: implement cat' using a balanced folding scheme (fixes #57)
Thanks to Felipe Lessa for the idea. The usual linear fold was O(n^2) because appending each new diagram to the accumulated list has to query its bounding function, which is built up as the iterated maximum of the O(n) bounding functions of the component diagrams. Doing a balanced fold is thus only O(n log n) (and seems to give a big speedup in practice). There is also probably more room for optimizing computation of bounding functions (using memoization and/or retaining a bounding box for the common special cases of querying in a direction parallel to an axis) but that can be left to future work.
The new implementation is also considerably prettier and shorter than the old, so it's a win all around. =)
Wed Nov 2 17:43:21 EDT 2011 Brent Yorgey byorgey@cis.upenn.edu
(Imported. Original comment by felipe.l...@gmail.com on November 3, 2011, 02:46:27 AM UTC)
Thanks a lot for being so fast on fixing this issue! =)
(Imported from http://code.google.com/p/diagrams/issues/detail?id=57. Original issue from felipe.l...@gmail.com on November 2, 2011, 06:27:13 PM UTC)
Hello!
I'm sorry that I didn't do my homework to investigate the "why", but I've included a very simple test case. You may compile and run the test as follows:
$ ghc -O --make -fforce-recomp -rtsopts bug.hs $ ./bug -o t.png --selection=??? +RTS -s
where ??? may be "hcat", "foldl1", "foldr1" or "foldTree". The program just concatenates 1,000 (rect 1 1)s. On my computer, I get the following:
hcat: 23s, 130 MiB foldr1: 30s, 69 MiB foldl1: 27s, 62 MiB foldTree: 1s, 7 MiB
Concatenating one thousand unit squares shouldn't take more than 20 seconds =). This is a showstopper for me, so I've reimplemented hcat (see hcatB at the end of [1])
Cheers!
[1] https://patch-tag.com/r/felipe/hierarchical-clustering-diagrams/snapshot/hash/20111102171621-0b5ec-1f98cd6a796d46b46f54dc2e168e826f459572ea/content/pretty/src/Diagrams/Dendrogram.hs