4096 is not an optimal chunk size, so this function reads files more slowly than it could.
In a quick benchmark using the example from https://docs.servant.dev/en/stable/cookbook/basic-streaming/Streaming.html and a 1 GB "README.md" file, curl to /dev/null takes 1.84 seconds.
I was able to get that down to 0.68 seconds. My optimisation happened to read the file into a lazy ByteString and then stream out its toChunks, which is ugly and I don't recommend it. (I was actually just investigating how to stream a lazy ByteString through servant when I noticed this performance problem.)
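For reference, the workaround was roughly along these lines. This is a minimal sketch using only the bytestring package, not the exact code I benchmarked; `lazyFileChunks` is a name made up for illustration, and the servant wiring is left out.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Lazy as BL

-- | Hypothetical helper: read a file lazily and expose its strict chunks.
-- BL.readFile uses lazy IO, so chunks are only read from disk as the
-- list is consumed, and the file handle stays open until then.
lazyFileChunks :: FilePath -> IO [BS.ByteString]
lazyFileChunks path = BL.toChunks <$> BL.readFile path

main :: IO ()
main = do
  -- Assumes a "README.md" exists in the current directory.
  chunks <- lazyFileChunks "README.md"
  -- Print the sizes of the first few chunks to see what the lazy reader picked.
  mapM_ (print . BS.length) (take 3 chunks)
```

The resulting list could then be handed to something like `source` from Servant.Types.SourceT to build the streaming response, which is the part that relies on lazy IO and that I would not recommend.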
Probably the right fix is to change 4096 to a better chunk size. Data.ByteString.Lazy.Internal.defaultChunkSize is a good candidate.
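To illustrate the idea, here is a sketch of a chunked file reader with the chunk size as a parameter. It is not servant's actual implementation; `readFileChunked` and its callback argument are hypothetical, and only base and bytestring are used. defaultChunkSize is just under 32 KiB on typical platforms.

```haskell
import qualified Data.ByteString as BS
import Data.ByteString.Lazy.Internal (defaultChunkSize)
import System.IO (Handle, IOMode (ReadMode), withBinaryFile)

-- | Hypothetical helper: feed a file to a callback one chunk at a time,
-- reading at most 'chunkSize' bytes per call.
readFileChunked :: Int -> FilePath -> (BS.ByteString -> IO ()) -> IO ()
readFileChunked chunkSize path consume =
  withBinaryFile path ReadMode loop
  where
    loop :: Handle -> IO ()
    loop h = do
      chunk <- BS.hGetSome h chunkSize
      if BS.null chunk
        then pure ()                 -- an empty chunk means end of file
        else consume chunk >> loop h

main :: IO ()
main = do
  putStrLn ("defaultChunkSize = " ++ show defaultChunkSize)
  -- Assumes a "README.md" exists in the current directory.
  -- Swap defaultChunkSize for 4096 to compare the number of reads.
  readFileChunked defaultChunkSize "README.md" (print . BS.length)
```

With a larger chunk size the loop makes fewer read calls for the same file, which is presumably where the difference in the benchmark comes from.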