openid / OpenID4VCI


Specify character encoding for application/x-www-form-urlencoded #229

Open adeinega opened 8 months ago

adeinega commented 8 months ago

I suggest explicitly specifying the character encoding for the application/x-www-form-urlencoded MIME type, so that examples such as

POST /token HTTP/1.1
Host: server.example.com
Content-Type: application/x-www-form-urlencoded
Authorization: Basic czZCaGRSa3F0MzpnWDFmQmF0M2JW

become

POST /token HTTP/1.1
Host: server.example.com
Content-Type: application/x-www-form-urlencoded;charset=utf-8
Authorization: Basic czZCaGRSa3F0MzpnWDFmQmF0M2JW

Note that, in rare cases, leaving the encoding unspecified can lead to interoperability issues, because the app servers and frameworks that run an OP do not necessarily use UTF-8 as their default character encoding; Java Servlets, for example, default to ISO-8859-1.

It's worth noting that RFC 6749 already specifies UTF-8:

The client makes a request to the token endpoint by adding the following parameters using the "application/x-www-form-urlencoded" format per Appendix B with a character encoding of UTF-8 in the HTTP request entity-body:
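To make the concern concrete, here is a minimal sketch of how the same parameter value serializes under the two encodings mentioned above. The `encode_form` helper and the sample `client_secret` value are hypothetical, chosen only because the value contains one non-ASCII character; this illustrates the byte-level difference and is not taken from any spec example.

```python
from urllib.parse import urlencode


def encode_form(params: dict[str, str], charset: str) -> str:
    """Serialize params as application/x-www-form-urlencoded using charset."""
    # urlencode percent-encodes the bytes produced by the given encoding.
    return urlencode(params, encoding=charset)


# Hypothetical value containing one non-ASCII character (u with diaeresis).
params = {"client_secret": "s\u00fcper"}

utf8_body = encode_form(params, "utf-8")         # the encoding RFC 6749 names
latin1_body = encode_form(params, "iso-8859-1")  # a non-UTF-8 default

print(utf8_body)    # client_secret=s%C3%BCper
print(latin1_body)  # client_secret=s%FCper
```

The two bodies differ only in the percent-encoded bytes, so a receiver cannot tell from the body alone which encoding the sender used.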

jogu commented 8 months ago

I am struggling to see why it would be advantageous to define the token endpoint differently from how it is defined in RFC6749. I think it is pretty well understood how the RFC6749 token endpoint works.

Can you explain how this would help please?

The only possible argument I can see is that the newly defined tx_code token endpoint parameter may contain non-ASCII characters I think, but that feels better addressed by drawing attention to the existing text in RFC6749.

I'd also note that as per https://www.iana.org/assignments/media-types/application/x-www-form-urlencoded there is no charset parameter defined for application/x-www-form-urlencoded.

adeinega commented 8 months ago

You are right, it is well understood how the token endpoint works, but I did not suggest defining it differently. The charset parameter does not change anything: "application/x-www-form-urlencoded" remains the same "application/x-www-form-urlencoded"; this charset parameter only explicitly indicates how to encode the characters in it. RFC 6749 already states that UTF-8 should be used, and it says a bit about application/x-www-form-urlencoded in https://datatracker.ietf.org/doc/html/rfc6749#appendix-B.

Rare interoperability issues may arise when non-ASCII characters are present in client_secret and other places, such as redirect_uri.
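A rough sketch of the failure mode I have in mind, assuming a client that encodes with UTF-8 and a server that decodes with ISO-8859-1. The secret value and the `received_secret` helper are made up for illustration; real servers parse the body in their own framework code.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical secret with one non-ASCII character (u with diaeresis).
registered_secret = "s\u00fcper"

# The client serializes the secret with UTF-8, as RFC 6749 describes.
body = urlencode({"client_secret": registered_secret}, encoding="utf-8")


def received_secret(form_body: str, charset: str) -> str:
    """Return client_secret as seen by a server decoding with charset."""
    return parse_qs(form_body, encoding=charset)["client_secret"][0]


# A server that decodes as UTF-8 recovers the original secret.
print(received_secret(body, "utf-8") == registered_secret)       # True

# A server that decodes as ISO-8859-1 sees two characters instead of one,
# so a comparison against the registered secret fails.
print(received_secret(body, "iso-8859-1") == registered_secret)  # False
```

Secrets made up of ASCII characters only are unaffected, which is why the issue is rare.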

jogu commented 8 months ago

this charset parameter only explicitly indicates how to encode the characters in it

Unfortunately it doesn't; the charset parameter has no defined meaning for this mime type.

adeinega commented 7 months ago

Just for the record, https://github.com/openid/OpenID4VP/issues/40 is about the same but in OpenID4VP.

https://www.iana.org/assignments/media-types/application/x-www-form-urlencoded considers only 7bit encoding.

Sakurann commented 1 month ago

Per Joseph's last comment, it does not sound like there is any action that needs to, or can, be taken on this?