Open Mingun opened 2 months ago
To be honest, I don't see much value in these elaborate tables. In summary, type checking:
<type>.instances.<instance>.if
(parse instance)" (as demonstrated by formats_err/expr_field_unknown_if_inst_pos.ksy
, which I recently added) and<type>.instances.<instance>.if
(value instance)" (see formats_err/expr_field_unknown_if_inst_value.ksy
).So the second "<type>.instances.<instance>.if
(value instance)" table can be completely disregarded, because this is not how things usually work and there's no point in overanalyzing it. The only takeaway is that type checks are not run there. All it takes to make it look like exactly like the first table is to run the forgotten type checking on the if
key of value
instances as well (which is literally one line of code in KSC).
Other remarks in the first table are just repeating what has been reported before in other issues:
❓ seq
Allowed If seq
depends fromme
, then should be denied (currently it is allowed and we get error in runtime)
❌ me
Allowed Check is performed before the field will be read. Access to the self getter is generated which leads to infinity recursion
See https://github.com/kaitai-io/kaitai_struct/issues/685
_
,_index
,_is_le
,_on
See https://github.com/kaitai-io/kaitai_struct/issues/404
I don't really see anything new in what you wrote, except for the simple fact that "<type>.instances.<instance>.if
(value instance)" is missing type checks (which seems to be only demonstrated by the expr_field_unknown_if_inst_value
test that I added just 2 days ago, so I don't blame you for not noticing and thanks for reporting it anyway).
I thought you would already know about the 2 existing issues because you've encountered them in the past:
https://github.com/kaitai-io/kaitai_struct/issues/926#issuecomment-1166429560
(https://github.com/kaitai-io/kaitai_struct/issues/685 seems to be a better match, but https://github.com/kaitai-io/kaitai_struct/issues/469 is related)
Or let's clearly indicate that this issue only reports missing type checks on if
in value instances so that it can be closed once the expr_field_unknown_if_inst_value
test passes.
Or let's clearly indicate that this issue only reports missing type checks on
if
in value instances so that it can be closed once theexpr_field_unknown_if_inst_value
test passes.
Yes, the intention was just about this missing check. Those tables just an extract from my work-in-progress of documenting in which contexts each special variable:
That is why it contains more information than needed just for this issue.
The idea that with each property with expression an unique context should be assigned which determines which special variables can be used and how they should be translated. To achive that I firstly want to create test cases, check how things work currently and establish a correct behavior, because it seems that this work was never done before. Some issues created here and there but without overall picture.
Yes, the intention was just about this missing check. Those tables just an extract from my work-in-progress of documenting
If you want to report one bug, open a simple issue reporting that one bug. If you want to create a far-reaching issue that attempts to solve world hunger, please do so in another issue.
Some issues created here and there but without overall picture.
Please don't tell me that you knew about the existing issues but didn't even bother to look them up and refer to them.
Huh?
Below the comparison between parse instance and value instance. There are many other problems with variables, but in this issue lets focus on missing checks of expression type:
following by the reproduction case and description of each variant.
Please don't tell me that you knew about the existing issues but didn't even bother to look them up and refer to them.
No, I don't. Thanks that you found related issues. The most obvious "type check" request found nothing similar to this issue.
Actually, if
in value instances should be prohibited. I do not see any real usage for it. For the
meta:
id: issue1129
instances:
me:
value: 1
if: _
...the following JS code is generated:
Object.defineProperty(TestType.prototype, 'me', {
get: function() {
if (this._m_me !== undefined)
return this._m_me;
if (_) {
this._debug._m_me = { };
this._m_me = 1;
}
return this._m_me;
}
});
As it can be obviously seen, the if
here is useless. It is allowed by this code
https://github.com/kaitai-io/kaitai_struct_compiler/blob/8610e1d7a796aad7addb64a5be2f9da48289bffc/shared/src/main/scala/io/kaitai/struct/format/InstanceSpec.scala#L42-L49
that was added in
https://github.com/kaitai-io/kaitai_struct_compiler/commit/67e5a3cc477b13c8204446544721c2538f0dc53d
and if
expressions was allowed in https://github.com/kaitai-io/kaitai_struct_compiler/commit/d6b4fbaac71f2b5d8436095a67317c5db9462f27
@GreyCat, why you added if
to value instances?
@Mingun:
Actually,
if
in value instances should be prohibited.
No, it shouldn't. This sounds like you don't have much experience writing or perhaps also reading .ksy files (I recommend doing it more, I've always enjoyed it). Ideally, if a .ksy specification is written properly and you're parsing a binary file that follows the format that the .ksy spec is for, you should be able to invoke all value instances and not get any errors or values that are meaningless because they're not applicable in the particular situation (so not only would they be calculated unnecessarily, but mainly their values could be misleading). Such errors or meaningless values would just confuse the user that inspects the object tree in visualizers, and especially nonsensical values would encourage misinterpretations when a user processes the Kaitai Struct object in their application.
What do I mean by "errors"? Using if
in a value instance allows you to avoid "null pointer exceptions" when you want to use a conditional field in the value
expression, e.g.:
seq:
- id: has_foo
type: b1
- id: foo
type: b7
if: has_foo
instances:
foo_as_signed:
value: (foo ^ 0x40) - 0x40
if: has_foo
doc-ref: https://graphics.stanford.edu/~seander/bithacks.html#VariableSignExtend
Thanks to the if: has_foo
on foo_as_signed
, even if has_foo
is false
and we invoke the foo_as_signed
instance, we won't get a NPE. This makes sense: foo_as_signed
is derived from foo
, so if foo
is null
, foo_as_signed
should be null
as well.
But it's not only about NPEs. Sometimes you want to clearly mark the instance as "not applicable" even if a value could be calculated without errors. As one example of many, take a look at zip.ksy:129-145
. The specification would work even without these if
s, but then the user would have to be extra careful which values are actually meaningful and which are nonsensical - it would certainly be confusing.
Currently I only check
<type>.instances.<instance>.if
, but I have feeling that the same problem exists for other fields where expression is allowed.Below the comparison between parse instance and value instance. There are many other problems with variables, but in this issue lets focus on missing checks of expression type:
Legend for the OK? column:
<type>.instances.<instance>.if
(parse instance)Parsed by
ParseUtils.getOptValueExpression
.Test KSY:
param
[^5]seq
[^2]seq
depends fromme
, then should be denied (currently it is allowed and we get error in runtime)instance
[^3]seq
me
[^6]unknown
[^4]unable to access 'unknown' in <type name> context
. In is better to enclose<type name>
in quotes_
java.util.NoSuchElementException: None.get
_buf
invalid ID: '\_buf', expected /^[a-z][a-z0-9_]*$/
_index
i
in JS). Must be denied, because condition is checked before the loop, so even withrepeat
using this variable has no sense_io
_is_le
invalid ID: '\_is_le', expected /^[a-z][a-z0-9_]*$/
. If<type>
supports variable byte order, should be allowed_on
java.util.NoSuchElementException: None.get
_parent
_root
_sizeof
<type>
's size<type>.instances.<instance>.if
(value instance)Parsed by
ParseUtils.getOptValueExpression
.Test KSY:
param
[^5]seq
[^2]seq
depends ofme
instance
[^3]instance
depends ofme
me
[^6]unknown
[^4]this.unknown
in JS)_
_
in JS)_buf
this._buf
in JS)_index
i
in JS). Must be denied, because condition is checked before the loop, so even withrepeat
using this variable has no sense_io
_is_le
this._isLe
in JS). If<type>
supports variable byte order, should be translated to check of the right field (this._is_le
in JS)_on
on
in JS)_parent
_root
_sizeof
<type>
's size[^2]: attribute of the enclosing type in
seq
section [^3]: instance of the enclosing type ininstances
section [^4]: non-existent attribute or instance in the enclosing type [^5]: parameter of the enclosing type [^6]: reference to self