kaitai-io / kaitai_struct

Kaitai Struct: declarative language to generate binary data parsers in C++ / C# / Go / Java / JavaScript / Lua / Nim / Perl / PHP / Python / Ruby
https://kaitai.io
3.96k stars 192 forks source link

Replace _parent and _root with custom parameters #770

Open generalmimon opened 4 years ago

generalmimon commented 4 years ago

I would like to additionally mention something I just recognized in my use case, because I have 4 types in 2 KSY files currently using each other:

Type1.ksy -> Type1.1 -> Type1.1.1
          -> Type2
Type2.ksy -> Type1.1 -> Type1.1.1

Type1.ksy contains and uses all subtypes starting with 1., while Type2.ksy doesn't contain any subtypes, but uses those of Type1.ksy. The problem now is that Type1.1.1 accesses Type1 using _root, which is not possible if used by Type2 anymore. Depending on how CTORs are created and called, this might only break on runtime instead of compile time.

this.dynAppError = new FmtOmsRec.FmtOmsAplVdb(this._io, this, _root);

Breaks on compile time, but might be logically wrong and might get fixed, which most likely leads to something like the following, breaking on runtime, because _root is null:

this.dynAppError = new FmtOmsRec.FmtOmsAplVdb(this._io[, this]);

To solve the problem on runtime in Type2, I would need to add a params declaration forwarding the logical root Type1, which is actually the same as _root already. But I'm unable to specify this, because _root and params are handled independently. Additionally, I need to rework access to _root in Type1.1.1 to access the additionally provided params.

this.dynAppError = new FmtOmsRec.FmtOmsAplVdb(this._io[, this], logicalRootType1);

Maybe I'm missing some important discussion, but aren't _parent and _root creating problems here which we wouldn't have without those? Now that we have params, wouldn't it be better to explicitly define if types use _parent and _root using those? This way KSY-authors could opt-in into having those, those would need to be assigned a type, so no wrong invocations for _parent because it's KaitaiStruct and matches everything, and there wouldn't be a problem with params allowed to cross KSY-borders while comparable _parent and _root are not.

Actually, I'm forwarding my logical root using params already more often than really wanted because the data I'm working on does contain a lot of errors which I fix at runtime, but to decide things I need the logical root at some places. So in my case, I need to deal now with NOT providing _root while DO providing the same thing as some logical root instead using params. Would be easier if I would only need to use params instead.

I saw things like USER_TYPE_NO_PARENT somewhere as well, which I didn't fully understand, but it sounds like those concepts would not be needed anymore as well if _parent is an opt-in-param.

For backwards compatibility, one could create some meta-key to switch between implicit _parent and _root per KSY and if those are missing, handle them like currently. I would switch that off for my KSYs and define params where I need them.

_Originally posted by @ams-tschoening in https://github.com/kaitai-io/kaitai_struct/issues/275#issuecomment-368820503_

webbnh commented 4 years ago

I don't have enough brain-bandwidth at the moment to contemplate whether it would be beneficial or even feasible to get rid of _parent and/or _root at this point in Kaitai's development, but I would point out that you can probably address your "root difficulty" (eh-hem...) by replacing the use of _root in Type1.1.1 with _parent->_parent.

ams-tschoening commented 4 years ago

The point is that _root and _parent are not safe to use in all cases, those might be NULL or simply the instance itself. Things heavily depend on where types are used. Just look at the following CTORs which are always available and one can't easily know which one has been used when instantiating a type:

public ParOmsAfl(KaitaiStream _io) {
    this(_io, null, null);
}

public ParOmsAfl(KaitaiStream _io, KaitaiStruct _parent) {
    this(_io, _parent, null);
}

public ParOmsAfl(KaitaiStream _io, KaitaiStruct _parent, ParOmsAfl _root) {
    super(_io);
    this._parent = _parent;
    this._root = _root == null ? this : _root;
    _read();
}

I need to work with data in which the same types might be reused in multiple different levels and one sometimes can't easily know if a _parent is available at all and how much levels up a parent type of interest is. Additionally, one might need to cast parent types or roots because of KaitaiStruct.

params OTOH are nothing more than enhancing/influencing CTORs and I therefore suggest of getting rid of the automatically created ones and only use custom ones. This provides full control over CTORs and solves trouble with not knowing which automatically generated CTOR has been called at runtime with which pointer being NULL or problems like that. The current handling of default-CTORs vs. params regarding KSY-borders is inconsistent as well.

Of course this is a major change, but I think one can implement that backwards compatible and it makes especially runtime more robust in the long term. It's at least worth considering.