Improves encoding performance by roughly another 20%.
I found that the encoder does not need to compute the disjoint set or collect edges. Instead, it can scan the adjacency graph directly for clusters, and that scan is memory-aligned.
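
To illustrate the idea (not the actual encoder code), here is a minimal sketch of a single in-order pass over a flat adjacency array. It assumes a row-major array of per-cell bitmasks and treats a "cluster" as a run of consecutive cells within a row that are missing at least one connection. The layout, the `full_mask` value, the cluster definition, and the name `scan_clusters` are all assumptions for illustration only.

```python
def scan_clusters(graph, width, full_mask=0b1111):
    """Return (start_index, length) for each run of non-full cells, row by row.

    Hypothetical layout: `graph` is a flat, row-major sequence of small
    integer bitmasks, one per cell. A cell equal to `full_mask` is fully
    connected and is skipped. No disjoint set or edge list is built.
    """
    clusters = []
    run_start = -1  # -1 means no run is currently open
    for i, mask in enumerate(graph):
        # Close any open run at a row boundary so runs never span rows.
        if i % width == 0 and run_start >= 0:
            clusters.append((run_start, i - run_start))
            run_start = -1
        if mask != full_mask:
            if run_start < 0:
                run_start = i  # open a new run
        elif run_start >= 0:
            clusters.append((run_start, i - run_start))
            run_start = -1
    # Close a run that reaches the end of the array.
    if run_start >= 0:
        clusters.append((run_start, len(graph) - run_start))
    return clusters
```

The point of the sketch is only that the array is visited once, front to back, which is friendly to caches and prefetching in a compiled implementation.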
I also replaced 4 if statements in the symbol-to-codepoint encoder with a lookup table to reduce branch mispredictions.
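
For readers unfamiliar with the technique, a rough sketch of swapping an if chain for a table follows. The symbol alphabet (`u`, `r`, `d`, `l`), the codepoint values, the `INVALID` sentinel, and the function names are hypothetical, since this description does not list the real ones. Python is used for brevity; the branch prediction benefit itself applies to the compiled encoder.

```python
INVALID = 0xFF  # hypothetical sentinel for symbols outside the alphabet

def encode_branchy(symbol: int) -> int:
    """Original shape: one comparison (branch) per possible symbol."""
    if symbol == ord("u"):
        return 0
    if symbol == ord("r"):
        return 1
    if symbol == ord("d"):
        return 2
    if symbol == ord("l"):
        return 3
    return INVALID

def build_table() -> bytes:
    """Build a 256-entry table indexed directly by the symbol byte."""
    table = bytearray([INVALID] * 256)
    for codepoint, symbol in enumerate(b"urdl"):
        table[symbol] = codepoint
    return bytes(table)

SYMBOL_TO_CODEPOINT = build_table()

def encode_lut(symbols: bytes) -> bytes:
    """Table-driven version: one indexed load per symbol, no comparisons."""
    return symbols.translate(SYMBOL_TO_CODEPOINT)
```

Both versions produce the same codepoints; the table version simply trades the data-dependent branches for a single array lookup.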