Properties/Content needs clarity

wmo-im / wis2-notification-message

WIS 2.0 MQP message to notify users of availability of new data

https://wmo-im.github.io/wis2-notification-message


Properties/Content needs clarity #98

Closed amilan17 closed 9 months ago

amilan17 commented 9 months ago

confusion about compressed vs uncompressed requirement
remove resulting size?
if data is under 4k, can the size of the notification be over 4k?

note, revision of this text may also impact text of /per/core/additional_properties

amilan17 commented 9 months ago

@david-i-berry @tomkralidis @wmo-im/tt-wismd

tomkralidis commented 9 months ago

Ref: https://wmo-im.github.io/wis2-notification-message/standard/wis2-notification-message-DRAFT.html#_properties_content

For discussion/decision at next TT-WISMD meeting.

josusky commented 9 months ago

This is related to #6. I agree that the current formulation is somewhat ambiguous, especially the sentence "The value must be below 4096". This does not match reality. The size is the size of the original data. Thus, if we have an IWXXM message whose size is, let's say, 5000 bytes, and the encoding is gzip, it can be embedded. The length of properties.content.value will be about 2000 bytes. That is, properties.content.size will be above 4096, but the resulting WNM will be only about 3000 bytes. (The numbers are rounded values taken from a real-life example.)
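A minimal sketch of the effect described above, not taken from the specification: it assumes that a gzip-encoded value is the gzip output wrapped in base64 so that it fits into JSON, and it uses a made-up, highly repetitive XML payload as a stand-in for a real IWXXM message (so it compresses far better than real data would).

```python
import base64
import gzip

# Hypothetical stand-in for an IWXXM (XML) message; real messages compress less well.
original = ("<iwxxm:METAR>" + "<obs>value</obs>" * 300 + "</iwxxm:METAR>").encode("utf-8")

# Assumption: with encoding "gzip" the value is gzip output, base64-encoded for JSON.
value = base64.b64encode(gzip.compress(original)).decode("ascii")

size = len(original)       # what properties.content.size would report (original data)
value_length = len(value)  # what actually occupies space inside the WNM

print(f"content.size = {size}, len(content.value) = {value_length}")
```

Here the reported size is above 4096 while the embedded value is much shorter, which is the mismatch with "The value must be below 4096".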

josusky commented 9 months ago

And of course, we have the opposite situation, where base64-encoded BUFR messages have a properties.content.size lower than the actual length of properties.content.value. BUFR SYNOPs are typically just a few hundred bytes long, but I can imagine (I wasn't able to quickly find an example circulating in WIS2 at the moment) an upper-air (TEMP) sounding whose size in its raw BUFR representation is slightly below 4096 but, due to the base64 encoding, would be slightly over the limit, and therefore shall not be embedded into the WNM.
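The base64 overhead is easy to quantify: padded base64 emits 4 characters for every 3 input bytes, rounded up. The sketch below uses a hypothetical 4000-byte payload of zero bytes in place of a real BUFR message; the function name is illustrative only.

```python
import base64

def base64_length(n: int) -> int:
    """Length of padded base64 output for n input bytes."""
    return 4 * ((n + 2) // 3)

raw = bytes(4000)  # hypothetical payload, slightly below 4096 bytes
assert len(base64.b64encode(raw)) == base64_length(len(raw))

print(base64_length(4000))  # 5336: over a 4096 limit on the encoded value
print(base64_length(3072))  # 4096: largest raw size that still fits
```

So if the 4096 limit were applied to the encoded value, any raw payload larger than 3072 bytes would no longer qualify for base64 embedding.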

kaiwirt commented 9 months ago

Given what @josusky says, to me it would only make sense to limit the size of the uncompressed / unencoded data. This might lead to a larger (or smaller) size for the encoded content. The size will only be larger to a limited extent (for the Base64 encoding).

If we limit the size of the encoded content, then one might need to first encode the content, then check the size and then maybe drop the content if it is too large.
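The encode-then-check workflow mentioned above could look roughly like this. It is only a sketch: `embed_content` and `MAX_VALUE_LENGTH` are hypothetical names, and applying the limit to the encoded value is the assumption under discussion, not settled wording.

```python
import base64
from typing import Optional

MAX_VALUE_LENGTH = 4096  # assumption: limit applies to the encoded value

def embed_content(data: bytes, limit: int = MAX_VALUE_LENGTH) -> Optional[dict]:
    """Encode first, then check; return None if the result must be dropped."""
    value = base64.b64encode(data).decode("ascii")
    if len(value) > limit:
        return None  # too large once encoded: publish the WNM without inline content
    return {"encoding": "base64", "size": len(data), "value": value}

print(embed_content(bytes(300)) is not None)  # small payload is embedded
print(embed_content(bytes(4000)) is None)     # raw size < 4096, but dropped after encoding
```

If the limit applied to the unencoded data instead, the check could run on `len(data)` before any encoding work is done.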

However, I would also like to discuss whether we can remove the content size limitation completely and point to the overall size limit of the message.

josusky commented 9 months ago

The total WNM size has been discussed in #3, and as a result we have, in /req/core/message_size (chapter 7.1.2. Message size):

A WNM message SHALL NOT exceed 8192 bytes.

Here we are discussing just the embedded data properties.content and the meaning of its size property. The limit of 4096 for the embedded content means that there is another 4 kB left for the other elements of the WNM. I agree with @kaiwirt that compression is somewhat tricky and requires some heuristics, but:

it is optional
the benefit is significant enough to make it worthwhile

josusky commented 9 months ago

PR #102 was created with new, hopefully clearer, formulations.

kaiwirt commented 9 months ago

@josusky I was just wondering what the additional benefit of having a size limit on the content is, given that we already limit the size of the whole message. But I am also happy with the merge.

josusky commented 9 months ago

@kaiwirt I do not remember all the details of the discussions, but I think the main reason was that the total size is more difficult to check in advance, and the most likely reason for a WNM to exceed the limit is embedded data (properties.content) - this is a "half" solution that could solve 95% of cases :-)