I propose adding a repeatable option string(Type, SepChar).
This would parse string-like things, handling escapes using the SWI-Prolog \ escape conventions, which are fairly universal. Type would become the functor of the token emitted whenever the tokenizer encounters SepChar, followed by a sequence of other characters (possibly including escapes), followed by a closing SepChar. The token's argument would be a SWI-Prolog string holding the contents.
Typical usage:
string(double_quote, 0'")
Then the text "hello \"Bob\", I see we all have fake ID today." would become
double_quote("hello \"Bob\", I see we all have fake ID today.")
This seems an adequate compromise between flexibility and complexity for handling the often awkward problem of quoted strings in language tokenizing. String recognition is a task that traditionally fits with tokenizing, rather than parsing.
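To make the intended behaviour concrete, here is a rough sketch of how such a recognizer might look as a DCG over character codes. The predicate names (quoted_token//3, quoted_body//2, escape_code/2) are hypothetical and not part of any existing library, and only a small subset of the escapes is handled (\n, \t, and a backslash followed by any other character standing for that character). A real implementation would need to cover the full SWI-Prolog escape syntax.

```prolog
% quoted_token(+Type, +SepChar, -Token)//
% Hypothetical sketch: recognise SepChar ... SepChar and build Type(String).
quoted_token(Type, Sep, Token) -->
    [Sep],
    quoted_body(Sep, Codes),
    { string_codes(String, Codes),
      Token =.. [Type, String] }.

% quoted_body(+SepChar, -Codes)//
% Collect codes up to the closing separator, resolving escapes on the way.
quoted_body(Sep, []) -->
    [Sep], !.                       % unescaped separator closes the string
quoted_body(Sep, [C|Cs]) -->
    [0'\\], !, [E],                 % backslash: the next code is escaped
    { escape_code(E, C) },
    quoted_body(Sep, Cs).
quoted_body(Sep, [C|Cs]) -->
    [C],                            % any other code is taken literally
    quoted_body(Sep, Cs).

% escape_code(+Escaped, -Code)
% Assumption: only a tiny subset of the escape conventions is shown here.
escape_code(0'n, 0'\n) :- !.
escape_code(0't, 0'\t) :- !.
escape_code(C, C).                  % e.g. \" and \\ map to themselves
```

With this sketch, a query such as phrase(quoted_token(double_quote, 0'", T), Codes), where Codes holds the character codes of a double-quoted text, would bind T to a double_quote/1 term wrapping the unescaped contents. An unterminated string simply fails here; how the tokenizer should report that case is left open.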