Closed tlemo closed 2 years ago
Good point. Love the feedback to improve the project.
Lexer input is incrementally copied to a mutable buffer with a fixed length. The buffer must be mutable, because text()
(and yytext
) places a \0
in the buffer. The source input (e.g. a string) is just specified a (persistent) source of input that is never modified by the scanner (or copied). If it is a temporary string, then the string will be lost before it can be scanned, which is the problem here.
I like the suggestion to use Input(std::string&&) = delete
, but which is not portable to classic C++, which made me reluctant to use it in the past for this project, although we can add a #if __cplusplus >= 201103L
check I suppose.
FYI. A mutable buffer can be specified for the lexer to scan with the buffer(source, length + 1)
method, same as Flex yy_scan_buffer(source, length + 2)
.
The generated lexer source will have additional comments to clarify the input/output, for example:
class yyFlexLexer : public FlexLexer {
public:
yyFlexLexer(
// a persistent source of input, empty by default
const reflex::Input& input = reflex::Input(),
// optional output stream, NULL means std::cout by default
std::ostream *os = NULL)
:
FlexLexer(input, os)
{
}
I've already updated the online documentation section that describes sources of input to the lexer, by adding a note: https://www.genivia.com/doc/reflex/html/index.html#reflex-input
I've already updated the online documentation section that describes sources of input to the lexer, by adding a note: https://www.genivia.com/doc/reflex/html/index.html#reflex-input
The note looks good to me, thanks for the quick response!
I just discovered, the hard way, that
reflex::Input
doesn't make a copy of the source text. This, by itself, may be a valid design choice, but it doesn't seem to be documented and can lead to pain and suffering in seemingly simple use cases, for example:A conspicuous note about the content lifetime constraints would help new users avoid this pitfall.
Also, a
Input(std::string&&) = delete;
constructor might catch some of the common, although not all, cases.