amnh / PCG

Phylogenetic Component Graph: Haskell program and libraries for general phylogenetic graph search

Investigate runtime regression #101

Closed recursion-ninja closed 5 years ago

recursion-ninja commented 5 years ago

One of these issues (#97, #98, #99) caused the following test to run 20 times longer:

datasets/dynamic/multi-block/dna/arthropods.pcg:

recursion-ninja commented 5 years ago

This has been resolved. See 17e710d662816b8882e79a4f91425e35232f2430 & 3a385c083949cf72bd92cbbd4c3d7bb2924839a0.

The issue was that when the discrete metric or the L1 norm was specified as the dynamic character metric, the metadata would not allocate a "dense TCM" if the alphabet was small (fewer than 9 symbols). It would instead prefer to perform all calculations in Haskell with the specialized TCM functions. The problem is that this is much slower than atomically aligning the dynamic characters across the FFI in C.
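For readers unfamiliar with the two metrics, here is a minimal sketch of what a "specialized" cost function versus a "dense" table could look like. All names (`discreteMetric`, `l1Norm`, `denseTCM`) are hypothetical and are not PCG's actual API. Representing symbols by their alphabet index is an assumption made for illustration.

```haskell
-- Hypothetical names for illustration only; not PCG's actual API.
-- Symbols are represented by their index in the alphabet (an assumption).

-- Discrete metric: 0 for a match, 1 for any mismatch.
discreteMetric :: Int -> Int -> Int
discreteMetric i j
  | i == j    = 0
  | otherwise = 1

-- L1 norm: absolute difference of the two symbol indices.
l1Norm :: Int -> Int -> Int
l1Norm i j = abs (i - j)

-- A dense table materialises every pairwise cost up front as an
-- n-by-n matrix, so it can be handed to other code as plain data.
denseTCM :: Int -> (Int -> Int -> Int) -> [[Int]]
denseTCM n cost = [ [ cost i j | j <- [0 .. n - 1] ] | i <- [0 .. n - 1] ]
```

The specialized functions need no table at all, which is attractive for large alphabets. For a small alphabet the table is tiny, so precomputing it costs little.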

We now bias toward string alignment in C if the alphabet is small, and toward native (Haskell) string alignment if the alphabet is large. In the case that the alphabet is large, we use either the specialized TCM functions or the memoized TCM when performing the native string alignment.
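The selection rule described above might be sketched roughly as follows. This is not the code from the referenced commits. The types, constructor names, and `chooseStrategy` are hypothetical. Reusing the "less than 9" threshold is an assumption. So is the guess that, for large alphabets, the discrete metric and L1 norm take the specialized path while other metrics take the memoized one.

```haskell
-- Hypothetical sketch; names and the exact rule are assumptions,
-- not taken from the PCG source.

data Metric = DiscreteMetric | L1Norm | OtherMetric
  deriving (Eq, Show)

data AlignmentStrategy
  = ForeignDense       -- align in C across the FFI
  | NativeSpecialized  -- align in Haskell with a specialized TCM function
  | NativeMemoized     -- align in Haskell with the memoized TCM
  deriving (Eq, Show)

-- Assumed cut-off: alphabets with fewer than 9 symbols count as small.
smallAlphabetLimit :: Int
smallAlphabetLimit = 9

chooseStrategy :: Metric -> Int -> AlignmentStrategy
chooseStrategy metric alphabetSize
  | alphabetSize < smallAlphabetLimit = ForeignDense       -- small: prefer C
  | metric == OtherMetric             = NativeMemoized     -- large, general metric
  | otherwise                         = NativeSpecialized  -- large, discrete or L1
```

Under these assumptions, a DNA-sized alphabet such as the one in the arthropods dataset would always take the `ForeignDense` path, whichever metric is chosen.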