taocpp / PEGTL

Parsing Expression Grammar Template Library
Boost Software License 1.0

star<ws> fails on more than 15 repetitions #90

Closed michael-brade closed 6 years ago

michael-brade commented 6 years ago

Hi, I am currently trying to find out why my seemingly correct grammar won't parse. Here's the first issue that I don't understand: take the JSON grammar as an example and give it input that starts with 16 spaces:

             {
}

It fails with:

     1      1 source:1:0(0)  start  tao::pegtl::disable<tao::pegtl::json::text>
     2      2 source:1:0(0)  start  tao::pegtl::json::text
     3      3 source:1:0(0)  start  tao::pegtl::star<tao::pegtl::json::ws>
     4      4 source:1:0(0)  start  tao::pegtl::json::ws
     5      4 source:1:0(0) failure tao::pegtl::json::ws
     6      3 source:1:0(0) success tao::pegtl::star<tao::pegtl::json::ws>
     7      5 source:1:0(0)  start  tao::pegtl::json::value
     8      6 source:1:0(0)  start  tao::pegtl::sor<tao::pegtl::json::string, tao::pegtl::json::number, tao::pegtl::json::object, tao::pegtl::json::array, tao::pegtl::json::false_, tao::pegtl::json::true_, tao::pegtl::json::null>
     9      7 source:1:0(0)  start  tao::pegtl::json::string
    10      8 source:1:0(0)  start  tao::pegtl::ascii::one<(char)34>
    11      8 source:1:0(0) failure tao::pegtl::ascii::one<(char)34>
    12      7 source:1:0(0) failure tao::pegtl::json::string
    13      9 source:1:0(0)  start  tao::pegtl::json::number
    14     10 source:1:0(0)  start  tao::pegtl::opt<tao::pegtl::ascii::one<(char)45> >
    15     11 source:1:0(0)  start  tao::pegtl::ascii::one<(char)45>
    16     11 source:1:0(0) failure tao::pegtl::ascii::one<(char)45>
    17     10 source:1:0(0) success tao::pegtl::opt<tao::pegtl::ascii::one<(char)45> >
    18     12 source:1:0(0)  start  tao::pegtl::json::int_
    19     13 source:1:0(0)  start  tao::pegtl::ascii::one<(char)48>
    20     13 source:1:0(0) failure tao::pegtl::ascii::one<(char)48>
    21     14 source:1:0(0)  start  tao::pegtl::json::digits
    22     15 source:1:0(0)  start  tao::pegtl::abnf::DIGIT
    23     15 source:1:0(0) failure tao::pegtl::abnf::DIGIT
    24     14 source:1:0(0) failure tao::pegtl::json::digits
    25     12 source:1:0(0) failure tao::pegtl::json::int_
    26      9 source:1:0(0) failure tao::pegtl::json::number
    27     16 source:1:0(0)  start  tao::pegtl::json::object
    28     17 source:1:0(0)  start  tao::pegtl::json::begin_object
    29     18 source:1:0(0)  start  tao::pegtl::ascii::one<(char)123>
    30     18 source:1:0(0) failure tao::pegtl::ascii::one<(char)123>
    31     17 source:1:0(0) failure tao::pegtl::json::begin_object
    32     16 source:1:0(0) failure tao::pegtl::json::object
    33     19 source:1:0(0)  start  tao::pegtl::json::array
    34     20 source:1:0(0)  start  tao::pegtl::json::begin_array
    35     21 source:1:0(0)  start  tao::pegtl::ascii::one<(char)91>
    36     21 source:1:0(0) failure tao::pegtl::ascii::one<(char)91>
    37     20 source:1:0(0) failure tao::pegtl::json::begin_array
    38     19 source:1:0(0) failure tao::pegtl::json::array
    39     22 source:1:0(0)  start  tao::pegtl::json::false_
    40     22 source:1:0(0) failure tao::pegtl::json::false_
    41     23 source:1:0(0)  start  tao::pegtl::json::true_
    42     23 source:1:0(0) failure tao::pegtl::json::true_
    43     24 source:1:0(0)  start  tao::pegtl::json::null
    44     24 source:1:0(0) failure tao::pegtl::json::null
    45      6 source:1:0(0) failure tao::pegtl::sor<tao::pegtl::json::string, tao::pegtl::json::number, tao::pegtl::json::object, tao::pegtl::json::array, tao::pegtl::json::false_, tao::pegtl::json::true_, tao::pegtl::json::null>
    46      5 source:1:0(0) failure tao::pegtl::json::value
    47      2 source:1:0(0) failure tao::pegtl::json::text
    48      1 source:1:0(0) failure tao::pegtl::disable<tao::pegtl::json::text>

Note line 5 of the trace (4 source:1:0(0) failure tao::pegtl::json::ws): ws already fails at the first space! If you delete one space, it works.

michael-brade commented 6 years ago

OK, forget it. It seems it was my fault: passing QPlainTextEdit::toPlainText().toStdString() directly to pegtl::memory_input creates a temporary std::string that is destroyed too soon, while the input still points into its data. I have no idea why it works with up to 15 spaces. Just for the record, the correct and working code is:

QByteArray data = source->toPlainText().toUtf8();
pegtl::memory_input module(data.data(), "source");

The QByteArray now holds the data long enough and is not a temporary anymore.

d-frey commented 6 years ago

Interesting. Well, it's a bit unfortunate that memory_input accepts an rvalue std::string; I guess I'll add a deleted overload to prevent that in the future.
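
For readers who have not seen the idiom, a deleted rvalue overload might look roughly like the sketch below. Note that view_input is a hypothetical stand-in invented for illustration, not PEGTL's actual memory_input; it only shows how constructing a non-owning input from a temporary std::string can be turned into a compile-time error.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical non-owning input (NOT the PEGTL class): it only stores
// pointers into a buffer that somebody else owns.
class view_input
{
public:
   // Lvalue strings are accepted; the caller must keep `s` alive and
   // unmodified for as long as the input is used.
   explicit view_input( const std::string& s ) noexcept
      : begin_( s.data() ),
        end_( s.data() + s.size() )
   {}

   // Temporaries are rejected at compile time: the stored pointers would
   // dangle as soon as the temporary is destroyed.
   explicit view_input( const std::string&& ) = delete;

   std::size_t size() const noexcept
   {
      return static_cast< std::size_t >( end_ - begin_ );
   }

private:
   const char* begin_;
   const char* end_;
};

// Compile-time checks of the intended behaviour.
static_assert( std::is_constructible< view_input, const std::string& >::value,
               "named strings are accepted" );
static_assert( !std::is_constructible< view_input, std::string&& >::value,
               "temporary strings are rejected" );
```

With an overload like this, a call such as view_input in( make_string() ); would no longer compile, whereas storing the string in a named variable first still works.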

What you probably wanted is a string_input, which copies/moves its argument into itself. Anyway, you figured out the root cause, so you already have a solution. :)
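
The difference an owning input makes can be sketched as follows. Everything here is an assumption made for illustration: owning_input, make_text and make_input are hypothetical names, and the class is not PEGTL's string_input. It only mirrors the idea of copying or moving the argument into the input so that a temporary is safe to pass.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical owning input (NOT the PEGTL class): it keeps its own copy
// of the bytes, so it does not depend on the caller's string staying alive.
class owning_input
{
public:
   // Taking the string by value lets callers copy a named string in or
   // move a temporary in.
   explicit owning_input( std::string data )
      : data_( std::move( data ) )
   {}

   const char* begin() const noexcept { return data_.data(); }
   const char* end() const noexcept { return data_.data() + data_.size(); }
   std::size_t size() const noexcept { return data_.size(); }

private:
   std::string data_;
};

// Stand-in for a call that returns a temporary string, similar in spirit
// to toStdString() in the report above.
inline std::string make_text()
{
   return std::string( 16, ' ' ) + "{\n}";
}

// Builds the input directly from the temporary. This is safe because the
// bytes are moved into the input instead of merely being pointed at.
inline owning_input make_input()
{
   return owning_input( make_text() );
}

// Small self-check that can be called from main().
inline void check_owning_input()
{
   const owning_input in = make_input();
   assert( in.size() == 19 );  // 16 spaces plus "{", a newline and "}"
   assert( *in.begin() == ' ' );
}
```

The trade-off is one copy or move of the data per input, in exchange for not having to reason about the lifetime of the original string.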

michael-brade commented 6 years ago

Yes, exactly! I also had that idea - taking a const std::string & and copying the .data() pointer out of it is kind of mean :)