Genivia / RE-flex

A high-performance C++ regex library and lexical analyzer generator with Unicode support. Extends Flex++ with Unicode support, indent/dedent anchors, lazy quantifiers, functions for lex and syntax error reporting and more. Seamlessly integrates with Bison and other parsers.
https://www.genivia.com/doc/reflex/html
BSD 3-Clause "New" or "Revised" License

Not compatible with flex #127

Closed ryjer closed 2 years ago

ryjer commented 2 years ago

These are the versions I am using:

$ reflex -V
reflex 3.1.0 x86_64-pc-linux-gnu
License BSD-3-Clause: <https://opensource.org/licenses/BSD-3-Clause>
Written by Robert van Engelen and others: <https://github.com/Genivia/RE-flex>
$ flex -V
flex 2.6.4

For this file

%{
#include <string.h>

int yywrap(void)
{
    return 1;
}

%}

%%
.   {printf("hello world!");}

flex works fine, but reflex reports an error. (The file has 13 lines in total; there is a blank line at the end that GitHub does not display.)

$ reflex test.l 
test.l:12: error: EOF encountered inside an action

It seems that reflex needs a closing %% to end the rules section:

%{
#include <string.h>

int yywrap(void)
{
    return 1;
}

%}

%%
.   {printf("hello world!");}

%%

ryjer commented 2 years ago

There is a blank line at the end of the file; without it, flex reports the same EOF error:

$ flex test.l 
test.l:12: EOF encountered inside an action

genivia-inc commented 2 years ago

A blank line in the rules is fine, as long as there is an ending %%.
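
For context, a Lex/Flex specification has up to three sections (definitions, rules, user code), and the second %% is what separates the rules from the optional user code. The layout below is only a sketch based on the test file in this thread: moving yywrap() into the user code section and including <stdio.h> for printf are choices made here for illustration, not something taken from the RE/flex documentation.

```lex
%{
#include <stdio.h>
%}

%%
.   { printf("hello world!"); }
%%

/* User code section: copied verbatim to the end of the generated scanner. */
int yywrap(void)
{
    return 1;
}
```

Written this way, the rules section has an explicit end, so the specification does not rely on the generator accepting a missing second %%.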

genivia-inc commented 2 years ago

Also, Flex has some (not many) poorly documented features that RE/flex replicates faithfully. However, bad undocumented (or underdocumented) features are not replicated, because they often cause confusion or result in compilation errors. I recall only one or two such cases that were deemed bad.

A missing ending %% is neither a good nor a bad feature of Flex, but an ugly one. It falls between the criteria RE/flex uses to decide whether to replicate a feature. I may replicate it in a future release.

genivia-inc commented 2 years ago

I should add that all Flex features are replicated, except for at least one bad situation where the interpretation of the Lex/Flex specification is ambiguous when using empty lines or other forms of spacing/indentation. If something is not clearly documented in Flex and the situation is ambiguous, then I left it out of reflex. This is what I mean by a "bad" feature.

genivia-inc commented 2 years ago

The second %% can be omitted in the latest v3.2.2 update.