Codas closed 9 years ago
This looks really cool, thank you for your work!
A few nitpicks:
So, after running the benchmark on my machine, it appears that cereal is faster than nom! I have to get back to work and fix that :smile:
Thanks for the quick merge!
It's important to realize that GHC and the cereal library have matured over many years. It's very impressive what Rust and nom can do already. Also, nom seems to handle large files much better. I have not benchmarked it, but I would expect nom to win for any file larger than maybe 10 MB.
As I mentioned in the readme, my goal with those benchmarks is to see in which performance range nom should be, not to be the fastest (usability is more important and my focus right now).
I looked a lot at attoparsec's source to get design ideas, so I'll probably look at how cereal handles it. If it is not applicable for nom, that could still make it in another library.
To handle large files, the way the parser benchmark is written is not good enough, since all of the file has to be mapped in memory (it is a good way to remove timing differences due to syscalls, though). The other version of that mp4 parser in the nom repository supports seeking, and that makes it a lot faster and less memory hungry.
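To make the seeking idea concrete, here is a minimal sketch using only the Rust standard library. It is not the code from the nom repository: `find_box` is a hypothetical helper, and it assumes the plain MP4 box layout (a 32-bit big-endian size followed by a four-byte type), ignoring the special sizes 0 and 1.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Hypothetical helper (not nom's API): scans top-level boxes in `reader`
/// and returns the payload of the first box whose type equals `wanted`.
/// Boxes of any other type are skipped with a seek instead of being read.
pub fn find_box<R: Read + Seek>(
    reader: &mut R,
    wanted: &[u8; 4],
) -> io::Result<Option<Vec<u8>>> {
    let mut header = [0u8; 8];
    loop {
        // No complete header left means we ran out of boxes.
        match reader.read_exact(&mut header) {
            Ok(()) => {}
            Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => return Ok(None),
            Err(e) => return Err(e),
        }
        let size = u32::from_be_bytes([header[0], header[1], header[2], header[3]]) as u64;
        if size < 8 {
            // Sizes 0 (box runs to end of file) and 1 (64-bit size follows)
            // are deliberately not handled in this sketch.
            return Ok(None);
        }
        let payload_len = size - 8;
        if header[4..8] == wanted[..] {
            // Only the box we care about is actually read into memory.
            // A real parser would cap this allocation for untrusted input.
            let mut payload = vec![0u8; payload_len as usize];
            reader.read_exact(&mut payload)?;
            return Ok(Some(payload));
        }
        // Skip the payload without reading it.
        reader.seek(SeekFrom::Current(payload_len as i64))?;
    }
}
```

Called with a `std::fs::File` (or a `std::io::Cursor` in tests) and `b"ftyp"`, this touches only the eight-byte headers of the boxes it skips, which is the reason a seeking parser can stay small in memory on large files.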
The interesting thing with nom's memory usage is that since it uses slices (a structure containing a pointer and a length) everywhere, and only parses the ftyp box, unneeded data is not even loaded in the process.
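As a rough illustration of the slice point, the sketch below splits one box off a byte slice without copying anything. It is plain Rust rather than nom combinators, and `BoxHeader`, `take_box` and `major_brand` are names made up for this example. It assumes the usual box layout (32-bit big-endian size, four-byte type) and that the `ftyp` payload starts with the four-byte major brand.

```rust
/// Header of an MP4 box: a 32-bit big-endian size and a four-byte type tag.
#[derive(Debug, PartialEq)]
pub struct BoxHeader {
    pub size: u32,
    pub kind: [u8; 4],
}

/// Hypothetical helper (not part of nom): splits one box off the front of
/// `input` and returns its header, its payload and the remaining bytes.
/// Every returned slice borrows from `input`; no bytes are copied.
pub fn take_box(input: &[u8]) -> Option<(BoxHeader, &[u8], &[u8])> {
    if input.len() < 8 {
        return None;
    }
    let size = u32::from_be_bytes([input[0], input[1], input[2], input[3]]);
    let kind = [input[4], input[5], input[6], input[7]];
    let total = size as usize;
    // Sizes 0 and 1 have special meanings and are not handled here;
    // a box claiming more bytes than we have is rejected as well.
    if total < 8 || total > input.len() {
        return None;
    }
    let (this_box, rest) = input.split_at(total);
    Some((BoxHeader { size, kind }, &this_box[8..], rest))
}

/// Returns the major brand of a leading `ftyp` box, if there is one.
/// The result is just a pointer and a length into the original buffer.
pub fn major_brand(input: &[u8]) -> Option<&[u8]> {
    let (header, payload, _rest) = take_box(input)?;
    if &header.kind != b"ftyp" {
        return None;
    }
    payload.get(..4)
}
```

When `input` comes from a memory-mapped file, `major_brand` only reads the first few bytes of the mapping, so the operating system never has to page in the rest of the file.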
This pull request adds a benchmark for a Haskell library specialized in parsing binary data without backtracking. Results should be roughly in line with those of the nom library for Rust.
Attoparsec results:
Cereal results:
Both were a bit faster at the time I updated the README, but the improvements should be clearly visible.