Open axman6 opened 6 years ago
For cases where you statically know the size of the output, like parsing, there's a trick in the library. See the JSON parsing:
It's pretty fast, as you can see from the benchmarks :)
It would be good to have some way to allocate the correctly sized array when constructing records with the syntax users are expected to use: `#foo := 7 & #bar := True & rnil`
(I haven't figured out how to do this, though, while keeping the same or similar syntax.)
A possible way to do this would be to track the currently allocated size in the record itself, and then, when adding a new field, check whether there's enough space: if yes, just add it; if no, reallocate and copy. The problem here is that this breaks referential transparency, so it would probably need to be wrapped in a monad and/or marked `unsafe` in all the places.
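For what it's worth, here's a minimal sketch of that idea with the mutation fenced off in `ST`, so referential transparency is preserved by construction. Everything here (`RecordBuilder`, `pushField`, `freezeRecord`) is hypothetical and homogeneous; the real library stores heterogeneous fields, so an actual implementation would have to erase them to a common type under the hood.

```haskell
import Control.Monad.ST (ST, runST)
import Data.Primitive.SmallArray

-- A mutable builder: the backing array plus the number of slots in use.
data RecordBuilder s a = RecordBuilder (SmallMutableArray s a) !Int

newBuilder :: Int -> ST s (RecordBuilder s a)
newBuilder cap = do
  arr <- newSmallArray (max 1 cap) (error "uninitialised field")
  pure (RecordBuilder arr 0)

-- Append one field; double the capacity when full (amortised O(1) per field).
pushField :: a -> RecordBuilder s a -> ST s (RecordBuilder s a)
pushField x (RecordBuilder arr used)
  | used < sizeofSmallMutableArray arr = do
      writeSmallArray arr used x
      pure (RecordBuilder arr (used + 1))
  | otherwise = do
      arr' <- newSmallArray (2 * used) (error "uninitialised field")
      copySmallMutableArray arr' 0 arr 0 used
      writeSmallArray arr' used x
      pure (RecordBuilder arr' (used + 1))

-- Copy out exactly the used prefix into an immutable array.
freezeRecord :: RecordBuilder s a -> ST s (SmallArray a)
freezeRecord (RecordBuilder arr used) = freezeSmallArray arr 0 used

-- The ST wrapper means no partially built record can ever be shared,
-- which is exactly the hazard in-place extension would otherwise create.
demo :: SmallArray Int
demo = runST (newBuilder 1 >>= pushField 7 >>= pushField 42 >>= freezeRecord)
```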
You might be able to use the `IsList` class with `-XOverloadedLists`. It provides a `fromListN` method which I believe is used in the desugaring of list literals, and of course gives access to list syntax for record construction.
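To illustrate: with `-XOverloadedLists`, GHC desugars an explicit literal like `[a, b, c]` via `fromListN 3 [a, b, c]`, so the final size arrives up front. A sketch against a hypothetical homogeneous `Row` type (a real record is heterogeneous, so its fields would have to be erased to a common type internally):

```haskell
{-# LANGUAGE TypeFamilies, OverloadedLists, BangPatterns #-}
import GHC.Exts (IsList(..))
import Control.Monad.ST (runST)
import Data.Primitive.SmallArray

newtype Row a = Row (SmallArray a)

instance IsList (Row a) where
  type Item (Row a) = a
  -- fromListN receives the length up front, so we allocate exactly once.
  fromListN n xs = Row $ runST $ do
    arr <- newSmallArray n (error "uninitialised field")
    let go !i (y:ys) = writeSmallArray arr i y >> go (i + 1) ys
        go _  []     = pure ()
    go 0 xs
    unsafeFreezeSmallArray arr
  fromList xs = fromListN (length xs) xs
  toList (Row arr) = foldr (:) [] arr

-- Desugars to fromListN 3 [1, 2, 3]: one allocation, no re-copying.
example :: Row Int
example = [1, 2, 3]
```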
This is a deal-breaker for this library, because it slows down typechecking even for relatively small records, which leads to a bad feedback loop even with haskell-language-server. I see a noticeable slowdown with records of fewer than 10 fields, and it becomes unusable at 15–20.
Is it maybe related to the sorting of records? How would one even start debugging this?
Possible; AFAIK the issue is that the type-level induction causes quadratic Core code, so it's probably best to get GHC to dump the Core and take a look at what is being generated.
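For reference, one standard way to do that (nothing library-specific, just GHC's dump flags; `-ddump-to-file` writes the output next to the build artifacts as `<Module>.dump-simpl`; the module name below is made up):

```haskell
{-# OPTIONS_GHC -ddump-simpl -dsuppress-all -ddump-to-file #-}
module RecordCore where
-- Build a record of n fields here, then inspect RecordCore.dump-simpl:
-- if the Core grows roughly quadratically in n, the induction is the culprit.
```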
Unless I'm misunderstanding something, it appears that record creation is `O(n^2)`, even though we have all the information needed at compile time to statically allocate a `SmallMutableArray#` of known size right away, write all the elements into it, and avoid all the copying. Given that one of the most appealing parts of this library is the `O(1)` field access even for large records, it seems wasteful to make record creation so (relatively) expensive. I'm imagining cases where I might be parsing a CSV file with millions of rows, and that `O(n^2)` is going to add up quickly!
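To make the cost concrete, here is a model of what a copy-per-field `&` does (`consCopy` is a hypothetical stand-in, not the library's actual code): extending a k-field record copies all k existing slots, so building n fields one at a time copies 0 + 1 + ... + (n-1) = n(n-1)/2 slots in total, i.e. O(n^2), where a one-shot allocation performs just n writes.

```haskell
import Control.Monad.ST (runST)
import Data.Primitive.SmallArray

-- Hypothetical model of copy-per-field extension: every call allocates a
-- fresh (n+1)-slot array and copies all n existing fields into it.
consCopy :: a -> SmallArray a -> SmallArray a
consCopy x old = runST $ do
  let n = sizeofSmallArray old
  new <- newSmallArray (n + 1) x   -- slot 0 holds the new field
  copySmallArray new 1 old 0 n     -- O(n) copy on every single extension
  unsafeFreezeSmallArray new
```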