zeek / spicy

C++ parser generator for dissecting protocols & files.
https://docs.zeek.org/projects/spicy
Other
247 stars 37 forks source link

Recursive type alias segfaults #1867

Open evantypanski opened 1 week ago

evantypanski commented 1 week ago

The following code segfaults during compilation via spicyc:

module segfault;

type Data = Data;

Like:

etyp:/Users/etyp/src/tests/spicy $ spicyc -c segfault-recursive.spicy
[1]    16815 segmentation fault  spicyc -c segfault-recursive.spicy

Offending line just loops infinitely trying to resolve (unsurprisingly)

I think there's merit in having this be possible, but it can also be a difficult problem sometimes, so just not segfaulting would work too.

EDIT: To be clear, that minimal case above doesn't seem useful, but this kind of type may be:

# This still segfaults
type Data = tuple<
    # Some random fields...
    optional<real>,
    optional<int64>,
    optional<bytes>,
    # Or a vector!
    optional<vector<Data>>,
>;
rsmmr commented 4 days ago

I believe we should actually reject both cases, and more generally such self-recursive types. A type defined through itself seems ill-defined in our type system, and cannot be expressed by the generated C++ code either. The way to make that work would be a reference:

type Data = tuple<
     ...
    optional<vector<Data&>>,
>;
evantypanski commented 3 days ago

With the reference (exactly the provided example) also seems to segfault

FWIW, C++17 (I think) does allow incomplete types in vectors:

#include <iostream>
#include <vector>

struct Data {
  std::vector<Data> v;
  int i = 0;
};

int main() {
  Data dat;
  dat.v.push_back({.i = 5});
  dat.v.push_back(dat);

  for (auto &inner : dat.v) {
    std::cout << inner.i << std::endl; // prints 5 then 0
  }
}

but that may be hard to replicate, I don't know

rsmmr commented 3 days ago

With the reference (exactly the provided example) also seems to segfault

Darn. :-) Now we definitely have a bug.

FWIW, C++17 (I think) does allow incomplete types in vectors:

Interesting, I never realized that C++17 added that. But yeah, I don't immediately see how we'd use that, seems the standard supports only vectors and lists for this. However, if there was a way to make this work for our containers and structs, then that could have quite an impact on our generated code because many instances of our ValueReference wrapping would no longer be needed; also see #1781, which addresses the same thing (though there are also new problems with putting more stuff on the heap: it has a performance impact on ours fibers)

bbannier commented 3 days ago

I think @timwoj reported some issues around incomplete types in Spicy when he tried to bump Zeek to C++20, I wonder (but didn't check) whether that was due to how we emit defaulted functions, see e.g., https://www.lukas-barth.net/blog/cpp20-vector-incomplete/#sec-speculation.