Open Mingun opened 4 years ago
Can you use #435 for it, with something like any-of
?
Until I understand how it will help. What exactly it should validate?
I had the thought of suggesting a new bitmap
type, maybe as replacement of seq
:
meta:
id: iso8583
bitmap:
size: 64 # 64bit
# bit, in which secondary bitmap is located. Some ISO 8583 variations
# uses not first, but last (64) bit for secondary bitmap
continue: 1
# number of bits, [1..size]. "continue" bit can not be used
2:
id: pan
type: str2
64:
id: mac
size: 8
encoding: hex # an example of #668 - read 8 bytes, decode hex and get 4 bytes in mac
types:
str2:
...
Um, title suggested that you wanted to stop parsing by raising an error. I don't quite understand the condition on which you want to stop, but if you can explain it, then probably that can be used as validation expression. Failing validation raises parse error.
As far as I understand, you have a sequence of bits that you either know and can describe what a bit stands for (and which additional payload it should bring), or you don't know how to handle that payload and thus you demand that bit to be zero (otherwise that means you've encountered something you can't parse). Thus, it boils down to:
seq:
# "bitmap"
- id: has_secondary_bitmap
type: b1
- id: has_pan
type: b1
- id: reserved1
type: b62 # consumes the rest of 64-bit sequence
valid: 0 # we expect that no other bits are used
# payloads that we know how to parse, depends on "bitmap" bits
- id: secondary_bitmap
# ...
if: has_secondary_bitmap
- id: pan
# ...
if: has_pan
Unfortunately, I don't really understand the proposal for "bitmap" type as replacement for "seq". Can you clarify the problem you're trying to solve with it?
Yes, all exactly as you describe. But, because I suspect, there can be similar formats, that has defined rules to determines payload boundaries, it will be also useful to allow define default case, that will be used for not described fields.
Unfortunately, I don't really understand the proposal for "bitmap" type as replacement for "seq". Can you clarify the problem you're trying to solve with it?
Not a replacement in the sense of completely removing seq
. Just another case of top-level structure of the type. Should automate description of the type, that you manually written above, but without necessary duplicate names (XXX
and has_XXX
), which can lead to mistakes. Besides, that special type definition can produce better errors. In your example error will look like:
"reserved1 should be 0, but XXX is encountered" or similar. Or, if describe it as
- id: unknown
type: b1
repeat: expr
repeat-expr: 62
valid:
all: false # Arbitrarily choosed syntax to check individual elements of array
"unknown[XXX] should be false
but it is true
".
If anything, I don't see how an error in such a very general mechanism can be improved for that case.
Also, it is trivially to specify default case in that syntax:
# Raises a parse error with nice message if encounter bit, not described in the message, ex.:
# "Unknown field #X in a bitmap message <type name>"
bitmap:
size: 64
1: secondary_bitmap
2: pan
...
# All not described fields parsed with `default` type
bitmap:
size: 64
default: # regular type definition (bitmap also possible :))
seq:
- id: len
...
# default: default_case # also accept that
1: secondary_bitmap
2: pan
...
Also, it shouldn't be forgotten that non-continuous regions of bits can be defined (ex: 1, 10, 20), i.e. several reserved definitions will be needed and all of them will need to be checked, if some checks will be required. You will have to manually maintain bitmap and field definitions in sync
I don't see much value in this top-level structure proposal: bit flags already provide pretty solid implementation, and it's much more flexible. I don't see having xxx
+ has_xxx
as big burden, and having it spelled explicitly allows you to control order in which these additional payloads are appearing, for example:
seq:
- id: has_foo
type: b1
- id: has_bar
type: b1
- id: reserved
type: b6
value: 0 # no need for complicated all elements of bit array are 0 checks
- id: more_attributes_here
size: 1234
# there could be actually more attributes here
- id: bar # bar comes 1st
# ...
if: has_bar
- id: foo # foo comes 2nd, despite order of bits if "foo", "bar"
# ...
if: has_foo
"reserved1 should be 0, but XXX is encountered" "unknown[XXX] should be
false
but it istrue
"
I don't see how these two are substantially different: both do the job and both allow to pinpoint what exactly is false. On the contrary, addressing bits as [XXX]
is generally very confusing, as there is no universal agreement on what "bit 27", for example, is.
I don't see how these two are substantially different:
I say both messages are equally bad. And I see no way to improve them in the current implementation
Ok, what message would you consider a better choice?
As I say above in code example, something like
Unknown field #\
in a bitmap message \ . Unknown fields breaks message layout, so can't be skipped and prevent further parsing
or
Not known how to parse field #\
or it's size. Field can't be skipped in a bitmap message \
where <X>
is bit number
In the financial industry the ISO 8583 protocol is widely used. This protocol consist of:
For example, first 2 fields of many ISO 8583 implementations can be described by the following
ksy
schema:Protocol itself is not describes, how to determine field length. So it is not backward-compatible, if you add new field between already defined fields, because old clients will not known, how to skip unknown field -- they do not known its length. That means, that parsers must throws an exception, when it comes with unknown field.
How to express that in
ksy
? Right now, if deletesecondary_bitmap
member from definition and try to parse message, that contains both,secondary_bitmap
andpan
fields,ksy
will parse bytes fromsecondary_bitmap
, aspan
field