zeusdeux / re2

Automatically exported from code.google.com/p/re2
BSD 3-Clause "New" or "Revised" License

bug: SearchBitState inconsistency #57

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Match this regular expression against a C++ header with multiple classes:
^class(?:\s+)([a-zA-Z0-9_]+)(?:\s+)\{(.+?)^\};

What is the expected output? 
FindAndConsume should fill two strings (the class name and the class body) for each class in the header.

What do you see instead?
fatal: re2/re2.cc:729: SearchBitState inconsistency

What version of the product are you using? On what operating system?
I'm using the latest on OS X.

When I try to match this particular regex, I receive a SearchBitState
inconsistency message. I love using re2 since it's so fast, but I've checked
other regex engines and this regex works with them. As far as I can tell, I'm
not doing anything that re2 can't support, but I'm happy to be corrected.

I've attached a main.c.

Here is relevant code:

void parse_file(const char *path) {
    ifstream ifile(path);
    if (ifile) {
        stringstream ss;
        ss << ifile.rdbuf();
        StringPiece sp_contents(ss.str());

        //get the classes
        const RE2 re2_class_regex("(?ms)^class(?:\\s+)([a-zA-Z0-9_]+)(?:\\s+)\\{(.+?)^\\};");
        string str_class_name;
        string str_class_contents;
        bool match_class = true;
        while (match_class) {
            match_class = RE2::FindAndConsume(&sp_contents, re2_class_regex, &str_class_name);
            if (match_class) {
                cout << str_class_name << endl;
            }
        }
    }

}

always outputs: "re2/re2.cc:729: SearchBitState inconsistency"

Original issue reported on code.google.com by oliver.w...@gmail.com on 10 Jan 2012 at 10:28


GoogleCodeExporter commented 9 years ago

Original comment by rsc@golang.org on 10 Jan 2014 at 3:26

GoogleCodeExporter commented 9 years ago
There is a bug in your program. 

        StringPiece sp_contents(ss.str());

ss.str() returns a temporary string, which must be copied into a named
variable in order to be preserved. StringPiece records a pointer into that
temporary, but the temporary is destroyed at the end of the statement, so the
pointer dangles. The inconsistency happens because the freed memory is reused
and changed by something else while the search runs.

        string s = ss.str();
        StringPiece sp_contents(s);

will work much better.

Original comment by rsc@golang.org on 10 Jan 2014 at 3:52
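
The lifetime rule described in the comment above can be sketched without re2 at all. This is only an illustration, assuming C++17 and using std::string_view as a stand-in for StringPiece (both are non-owning views over someone else's bytes). The helper name first_word is hypothetical and not part of the original program.

```cpp
#include <sstream>
#include <string>
#include <string_view>

// Hypothetical helper (not from the original report): returns the first
// whitespace-delimited word of the stream's contents.
std::string first_word(const std::stringstream& ss) {
    // ss.str() returns a temporary. Binding it to a named string keeps the
    // bytes alive for as long as the view below is in use.
    const std::string contents = ss.str();

    // The view only stores a pointer and a length into `contents`.
    // Writing `std::string_view view(ss.str());` instead would leave it
    // pointing at memory freed at the end of that statement.
    std::string_view view(contents);

    const auto end = view.find_first_of(" \t\n");
    // Copy the result out into an owning string before `contents` goes away.
    return std::string(view.substr(0, end));
}
```

Called on a stream holding "class Foo {", this sketch returns "class". The same ordering applies to the re2 code in the report: the named string has to be declared before the StringPiece and has to stay in scope for the whole FindAndConsume loop.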