Currently the tokenizer is a simple for loop over the input string. This is easy to implement and understand, but it leaves some performance on the table. Vectorization (e.g. SIMD) can be used to search the input string for noteworthy symbols (opening and closing brackets) more quickly.
MemoryExtensions.IndexOfAny() has been internally vectorized in .NET 5. If the tokenizer is rewritten to use IndexOfAny() to jump forward to the next noteworthy symbol, it can skip over runs of plain text much more quickly than by inspecting one character at a time.
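
As a rough illustration of the idea, here is a minimal sketch of a scan loop built on IndexOfAny(). It is not the tokenizer's actual code: the BracketScanner class, the FindBrackets method, and the choice of '[' and ']' as the noteworthy symbols are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the existing tokenizer.
public static class BracketScanner
{
    // Returns the position and character of every bracket in the input.
    // Assumes the noteworthy symbols are '[' and ']'.
    public static List<(int Index, char Symbol)> FindBrackets(ReadOnlySpan<char> input)
    {
        var found = new List<(int Index, char Symbol)>();
        int offset = 0;

        while (offset < input.Length)
        {
            // IndexOfAny searches the remaining slice and returns an index
            // relative to that slice, or -1 if neither symbol occurs.
            int hit = input.Slice(offset).IndexOfAny('[', ']');
            if (hit < 0)
            {
                break; // only plain text remains
            }

            int absolute = offset + hit;
            found.Add((absolute, input[absolute]));

            // Resume the search just past the symbol that was found.
            offset = absolute + 1;
        }

        return found;
    }
}
```

Calling BracketScanner.FindBrackets("Hello [b]world[/b]") reports brackets at indexes 6, 8, 14 and 17. A real tokenizer would emit the text between hits as text tokens instead of only recording positions, but the jump-ahead loop would have the same shape.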