Open noughtmare opened 5 years ago
@emmericp
Nice, this looks like it should do a bit better.
Would be good to change the GC parameters as well (at least a bigger -A and -H), as the GHC defaults are extremely conservative.
I have made some more changes to make the code even more performant: https://github.com/noughtmare/ixy.hs/commit/2c24c5cc7a58df1bbd689b6b9b19e6f71b1abefe. But that changes a lot about the code. It replaces all `IOUArray`s with unboxed vectors and uses unboxed `IORef`s instead of the standard boxed ones. It also uses a storable vector instead of the standard list for the send and receive functions. That makes the code uglier (there are ways to make it prettier again, but I haven't done that yet), so I haven't pushed it to this branch.
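To illustrate the boxed versus unboxed reference idea from the comment above, here is a minimal sketch using only `base`. It is not the code from the linked commit: `IntCell`, `newIntCell`, `readIntCell`, `modifyIntCell`, `countBoxed`, and `countUnboxed` are hypothetical names, and the cell is built on a `ForeignPtr` rather than on whichever unboxed-reference library the commit actually uses.

```haskell
import Data.IORef (newIORef, readIORef, modifyIORef')
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtr, withForeignPtr)
import Foreign.Storable (peek, poke)

-- Hypothetical unboxed cell: the Int is stored as raw bytes,
-- not as a pointer to a boxed Int heap object.
newtype IntCell = IntCell (ForeignPtr Int)

newIntCell :: Int -> IO IntCell
newIntCell x = do
  fp <- mallocForeignPtr
  withForeignPtr fp (\p -> poke p x)
  pure (IntCell fp)

readIntCell :: IntCell -> IO Int
readIntCell (IntCell fp) = withForeignPtr fp peek

modifyIntCell :: IntCell -> (Int -> Int) -> IO ()
modifyIntCell (IntCell fp) f = withForeignPtr fp (\p -> peek p >>= poke p . f)

-- Counter kept in a standard boxed IORef.
countBoxed :: Int -> IO Int
countBoxed n = do
  ref <- newIORef 0
  mapM_ (\_ -> modifyIORef' ref (+ 1)) [1 .. n]
  readIORef ref

-- The same counter kept in the unboxed cell.
countUnboxed :: Int -> IO Int
countUnboxed n = do
  cell <- newIntCell 0
  mapM_ (\_ -> modifyIntCell cell (+ 1)) [1 .. n]
  readIntCell cell
```

Both counters return the same result. Whether the unboxed variant is actually faster depends on how well GHC already optimizes the `IORef` version, so it should be measured rather than assumed.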
Please test the performance of this.
Currently I've tried to optimize the `receive` function. I might look at `send` later.
The correctness should still be checked before this can be merged.
This pull request includes:
- [...] (`unsafeRxMap` and `unsafeRxGetMapping`).
- [...] `rxqDescriptor` function from the `RxQueue` record and write it as a separate inlinable function.

Now the `go` loop inside the `receive` function uses unboxed operations almost exclusively, which means that no garbage gets generated. The only non-unboxed operations are due to `IORef`s. That might be a further opportunity for optimization. There is also allocation for the `bufPtr` that is consed to the `bufs` list, so using a mutable vector for that would probably also improve performance.
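The last point, replacing the consed `bufs` list with a mutable buffer, can be sketched as follows. This is an illustration under assumptions, not the PR's code: `receiveList`, `receiveInto`, `receiveViaBuffer`, and the `fetch` callback are hypothetical, `Int` stands in for the buffer pointer type, and a raw `Ptr Int` array from `base` stands in for a mutable vector.

```haskell
import Foreign.Marshal.Array (allocaArray, peekArray)
import Foreign.Ptr (Ptr)
import Foreign.Storable (pokeElemOff)

-- List version: every received item allocates a new cons cell.
receiveList :: Int -> (Int -> IO Int) -> IO [Int]
receiveList n fetch = go 0 []
  where
    go i acc
      | i >= n = pure (reverse acc)
      | otherwise = do
          x <- fetch i
          go (i + 1) (x : acc)

-- Buffer version: writes each item into a preallocated array supplied
-- by the caller and returns how many items were written.
receiveInto :: Ptr Int -> Int -> (Int -> IO Int) -> IO Int
receiveInto out n fetch = go 0
  where
    go i
      | i >= n = pure i
      | otherwise = do
          x <- fetch i
          pokeElemOff out i x
          go (i + 1)

-- Convenience wrapper for trying it out: allocates a temporary array,
-- fills it, and copies the result into a list for inspection.
receiveViaBuffer :: Int -> (Int -> IO Int) -> IO [Int]
receiveViaBuffer n fetch = allocaArray n $ \out -> do
  k <- receiveInto out n fetch
  peekArray k out
```

In a real driver the output array would be allocated once and reused across calls, so the inner loop itself would not allocate per packet. The wrapper here converts back to a list only to make the result easy to compare.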