mouse07410 / asn1c

The ASN.1 Compiler
http://lionet.info/asn1c/
BSD 2-Clause "Simplified" License

jer: Pure hexadecimal JSON OCTET STRINGs #193

Closed: v0-e closed this 2 months ago

v0-e commented 2 months ago

As per X.697 (2021), clause 25.3, a JER OCTET STRING must consist solely of hexadecimal characters (inside quotation marks). This PR makes the JER decoder reject OCTET STRINGs that contain whitespace or any other non-hexadecimal characters. It also adds tests for JER OCTET STRINGs.
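To make the proposed behaviour concrete, here is a minimal sketch of a strict decoder for the contents of a JER OCTET STRING. It is not asn1c's actual code: `hex_nibble` and `strict_hex_decode` are hypothetical names, the input is assumed to be the string body with the surrounding quotation marks already stripped, and accepting both upper- and lower-case digits as well as rejecting an odd number of digits are assumptions of this sketch rather than statements about the PR.

```c
#include <stddef.h>
#include <stdint.h>

/* Map one hex character to its value, or -1 if it is not a hex digit. */
static int hex_nibble(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Hypothetical helper (not part of asn1c's API): strictly decode the body
 * of a JSON string into bytes.
 * Returns the number of bytes written, or -1 if the input contains any
 * non-hex character (whitespace included), has an odd number of digits,
 * or does not fit into the output buffer.
 */
static long strict_hex_decode(const char *s, size_t len,
                              uint8_t *out, size_t out_size) {
    if (len % 2 != 0 || len / 2 > out_size) return -1;
    for (size_t i = 0; i < len; i += 2) {
        int hi = hex_nibble(s[i]);
        int lo = hex_nibble(s[i + 1]);
        if (hi < 0 || lo < 0) return -1;   /* reject, do not skip */
        out[i / 2] = (uint8_t)((hi << 4) | lo);
    }
    return (long)(len / 2);
}
```

With this sketch, `"01ff"` decodes to two bytes, while the same digits separated by spaces or newlines are rejected outright.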

mouse07410 commented 2 months ago

Are you sure you want to tighten the decoding part? I'd be happy to make the encoder stricter in what it outputs - but doesn't this run against the Internet wisdom of "Be liberal wrt. what you receive, and conservative wrt. what you send"?

v0-e commented 2 months ago

I don't dislike that philosophy :) However, the decoder was allowing things like raw newlines, which are invalid in JSON strings, e.g.,

"abc
def"
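The example above is invalid because RFC 8259 requires control characters (U+0000 through U+001F) to be escaped inside a JSON string. A minimal sketch of that check, using the hypothetical name `has_raw_control_char` and assuming the string body is passed without its quotation marks:

```c
#include <stddef.h>

/*
 * Hypothetical helper: return 1 if the string body contains an unescaped
 * control character (byte value below 0x20), which JSON forbids inside a
 * string; return 0 otherwise. A plain space (0x20) is allowed by JSON.
 */
static int has_raw_control_char(const char *s, size_t len) {
    for (size_t i = 0; i < len; i++) {
        if ((unsigned char)s[i] < 0x20) return 1;
    }
    return 0;
}
```

Under this check a raw newline or tab fails, whereas a plain space passes at the JSON level even though it is not a hexadecimal character.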

Would you prefer:

  1. allow only JSON- and JER-compliant encodings (e.g., strict hex chars, as in the current PR);
  2. allow JSON-compliant but non-JER-compliant encodings (e.g., reject newlines \n but allow whitespace);
  3. allow non-JSON and non-JER encodings (e.g., allow any hex or non-hex char in OCTET STRINGs and just parse the hex chars);
  4. or allow only the "whitespace" chars that were accepted before the proposed changes in OCTET STRINGs, and just parse the hex chars.

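The options listed above differ only in which non-hex characters the decoder silently skips. The sketch below expresses them as a mode switch. All names (`octet_mode_t`, `is_skippable`, `decode_octets`) are hypothetical and not asn1c's API. Two assumptions are made: the option allowing whitespace but rejecting newlines is read as "skip the plain space only", since other whitespace such as tab is a control character in JSON, and the set of "previously allowed whitespace" is taken to be whatever `isspace` accepts in the "C" locale, which may not match what the old decoder actually accepted.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    MODE_STRICT_JER = 1,     /* hex digits only */
    MODE_JSON_COMPLIANT,     /* additionally skip plain spaces */
    MODE_SKIP_ANY_NON_HEX,   /* skip every non-hex character */
    MODE_LEGACY_WHITESPACE   /* skip whitespace, newlines included (assumed set) */
} octet_mode_t;

/* Decide whether a non-hex character is skipped (1) or is an error (0). */
static int is_skippable(octet_mode_t mode, unsigned char c) {
    switch (mode) {
    case MODE_JSON_COMPLIANT:    return c == ' ';
    case MODE_SKIP_ANY_NON_HEX:  return 1;
    case MODE_LEGACY_WHITESPACE: return isspace(c) != 0;
    default:                     return 0;
    }
}

/*
 * Decode hex digits from s into out, skipping characters the mode allows.
 * Returns the number of bytes written, or -1 on a disallowed character,
 * a dangling half-byte, or an output buffer that is too small.
 */
static long decode_octets(octet_mode_t mode, const char *s, size_t len,
                          uint8_t *out, size_t out_size) {
    size_t n = 0;
    int have_hi = 0;
    unsigned hi = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)s[i];
        if (isxdigit(c)) {
            unsigned v = isdigit(c) ? (unsigned)(c - '0')
                                    : (unsigned)(tolower(c) - 'a' + 10);
            if (!have_hi) {
                hi = v;
                have_hi = 1;
            } else {
                if (n >= out_size) return -1;
                out[n++] = (uint8_t)((hi << 4) | v);
                have_hi = 0;
            }
        } else if (!is_skippable(mode, c)) {
            return -1;
        }
    }
    return have_hi ? -1 : (long)n;
}
```

For instance, the input `ab cd` is rejected in the strict mode and decodes to two bytes in every other mode, while a newline between the digits is accepted only in the last two modes.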
mouse07410 commented 2 months ago

> However the decoder was allowing stuff like newlines which are invalid in JSON strings [...] Would you prefer:
>
>   1. allow only JSON and JER compliant encodings (e.g., strict hex chars, as in the current PR);
>   2. allow JSON-compliant and non-JER encodings (e.g., reject newlines \n and allow whitespaces);
>   3. allow non-JSON and non-JER encodings (e.g., allow any hex and non-hex char in OCTET STRINGs and just parse the hex chars);
>   4. or just allow the "whitespace" chars previously allowed before the proposed changes in OCTET STRINGs and just parse the hex chars

My main (only?) concern is breaking currently-working code by refusing to support what it currently consumes successfully.

Do we know how many non-compliant encodings are floating around now?

From your list, the most tempting is (4), maybe with newlines allowed too in order to support breaking very long strings into "displayable"/"editable" shorter lines...

What is your opinion?

v0-e commented 2 months ago

My biggest concern would be a possible disparity in leniency between decoders. For example, asn1c might be able to decode some JSON object that commercial JER decoder X, open-source JER decoder Y, or even a plain JSON validator fails to parse.

Ultimately in this aspect, I guess it is a choice between using asn1c as a validator or as a "generous" decoder.

> My main (only?) concern is breaking currently-working code by refusing to support what it currently consumes successfully.

Our JER decoder implementation is fairly recent, so there probably isn't much code using it yet :smile:.

> Do we know how many non-compliant encodings are floating around now?

I'd say there are more "JER encoders" out there than encoders for any other encoding rules. Given how simple it is to create JSON in a typical JS/TS environment, developers may be more willing to write their own (non-compliant) JER encoder/decoder implementations. Plus, since this is a text-based encoding, there is a higher chance of it being modified incorrectly by hand.