Following the work on issue #300, it seems plausible that other situations will arise in which the default tokenization regex is not appropriate for a particular linguistic or historical context. We should consider allowing users to define their own tokenization regex in the config file.
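As a rough illustration of what this could look like, here is a minimal sketch in Python. It assumes the config has already been loaded into a dict; the key name `token_regex`, the default pattern `\w+`, and the helper `build_tokenizer` are all hypothetical and are not taken from the existing codebase.

```python
import re

# Hypothetical default; the project's actual default pattern may differ.
DEFAULT_TOKEN_REGEX = r"\w+"


def build_tokenizer(config):
    """Return a tokenize(text) function driven by the (hypothetical) `token_regex` config key."""
    # Fall back to the default when the key is missing or empty.
    pattern = config.get("token_regex") or DEFAULT_TOKEN_REGEX
    try:
        compiled = re.compile(pattern)
    except re.error as exc:
        # Report a bad user-supplied pattern when the config is read,
        # rather than failing later in the middle of tokenization.
        raise ValueError(f"Invalid token_regex {pattern!r}: {exc}") from exc

    def tokenize(text):
        # group(0) is the whole match, so capture groups in a user
        # pattern do not change the shape of the output.
        return [match.group(0) for match in compiled.finditer(text)]

    return tokenize


# Default behaviour: the apostrophe splits the word.
default_tokenize = build_tokenizer({})
print(default_tokenize("don't stop"))

# User-defined pattern that keeps apostrophes inside tokens.
custom_tokenize = build_tokenizer({"token_regex": r"[\w']+"})
print(custom_tokenize("don't stop"))
```

Validating the pattern when the config is loaded would let us report a malformed regex to the user up front, and falling back to the default keeps existing configs working unchanged.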