Open Daniel-Mietchen opened 4 years ago
One way to go about this would be to allow certain non-text characters (e.g. something like %22) to be present in the input and to be kept rather than stripped away, much like dashes are already retained today.
I have been thinking about newline as a possible hack for tokenizing.
This would make it possible to better capture more complex constructs like matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (Q1792222).
Ideally, the user could set lower and upper bounds for N.
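To make the idea concrete, here is a rough Python sketch, not the tool's actual code. It assumes that N refers to the n-gram length, that %22 arrives URL-encoded, and that tokens are split on whitespace. The function names `tokenize` and `ngrams` and the parameters `keep`, `n_min` and `n_max` are hypothetical, made up for illustration.

```python
import re
from urllib.parse import unquote


def tokenize(text, keep="-/"):
    """Split text on whitespace, dropping punctuation except characters in `keep`.

    `keep` is a hypothetical user setting listing characters to retain.
    """
    decoded = unquote(text)  # e.g. %22 becomes a double quote
    # Replace everything that is not a word character, whitespace,
    # or an explicitly kept character with a space.
    pattern = r"[^\w\s" + re.escape(keep) + r"]"
    cleaned = re.sub(pattern, " ", decoded)
    return cleaned.split()


def ngrams(tokens, n_min=1, n_max=3):
    """Return all n-grams with n_min <= n <= n_max, shortest first."""
    result = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            result.append(" ".join(tokens[i:i + n]))
    return result


if __name__ == "__main__":
    tokens = tokenize(
        "matrix-assisted laser desorption/ionization "
        "time-of-flight mass spectrometry"
    )
    # With bounds 6..6 the whole six-token construct is one candidate.
    print(ngrams(tokens, n_min=6, n_max=6))
```

With dashes and slashes kept, the example term splits into six tokens, so an upper bound of at least 6 would be needed to capture it as a single candidate. Passing `keep='-/"'` would additionally retain double quotes decoded from %22.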