amnh / PCG

Phylogenetic Component Graph: Haskell program and libraries for general phylogenetic graph search

Add a benchmarking suite for the file parsers #139

Closed recursion-ninja closed 5 years ago

recursion-ninja commented 5 years ago

We should ensure that the parsers are performing well and do not allocate excessive space when parsing an input stream.

Starting with benchmarking suites for the FASTA & FASTC parsers seems best, as they are the simplest. We can explore which benchmarking approach works best for file parsers with these simpler ones before considering TNT or Nexus.

We can perhaps start from the template used by megaparsec for benchmarking.
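As a rough idea of the shape such a suite could take, here is a minimal sketch in the spirit of that template, using criterion. The `parseFasta` function is a hypothetical stand-in written only for illustration (it is not the PCG parser), and `mkInput`, `sizes`, and `benchParser` are likewise made-up names. The point is the harness: generate inputs of increasing size with `env` so construction is not measured, and force the whole result with `nf`.

```haskell
import Criterion.Main (Benchmark, bench, bgroup, defaultMain, env, nf)

-- | Hypothetical stand-in for a FASTA parser (NOT the PCG implementation):
-- pairs each '>' header with its concatenated sequence lines.
parseFasta :: String -> [(String, String)]
parseFasta = go . filter (not . null) . lines
  where
    go [] = []
    go (l : ls) = case l of
      '>' : name ->
        let (body, rest) = break isHeader ls
        in  (name, concat body) : go rest
      _ -> go ls -- skip stray lines that precede the first header

    isHeader ('>' : _) = True
    isHeader _         = False

-- | Build a synthetic input stream containing @n@ records,
-- each with one header line and two 60-character sequence lines.
mkInput :: Int -> String
mkInput n = unlines (concatMap record [1 .. n])
  where
    record i =
      [ '>' : "taxon" ++ show i
      , take 60 (cycle "ACGT")
      , take 60 (cycle "TGCA")
      ]

-- | Input sizes (number of records) to benchmark; chosen arbitrarily here.
sizes :: [Int]
sizes = [100, 1000, 10000]

-- | One benchmark group per parser, one benchmark per input size.
-- 'env' builds and fully evaluates the input outside the timed region;
-- 'nf' forces the parse result to normal form so laziness hides no work.
benchParser :: String -> (String -> [(String, String)]) -> Benchmark
benchParser name parser = bgroup name (map one sizes)
  where
    one n = env (pure (mkInput n)) (bench (show n) . nf parser)

main :: IO ()
main = defaultMain [benchParser "fasta" parseFasta]
```

Swapping in a real parser should only require changing the function passed to `benchParser`, provided its result type has an `NFData` instance. Measuring allocation would need a separate tool, since criterion reports time by default.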

recursion-ninja commented 5 years ago

Benchmarks have been added for the FASTA, FASTC, and Newick parsers. After tuning the parser code and data structures based on benchmark feedback, space and time usage improved by up to a factor of four, depending on the parser and input stream.

Adding benchmarking suites for the TCM and VER parsers should be straightforward.

Benchmarking the TNT and Nexus parsers is beyond the scope of this issue.

recursion-ninja commented 5 years ago

There are now benchmarking suites for the following file parsers included in pcg-file-parsers:

Additionally, the output types and parsing functionality of many parsers have been tuned to reduce time & space usage (more space than time).

For more details, see 0e260a6 through 1491e74.