halaxa / json-machine

Efficient, easy-to-use, and fast PHP JSON stream parser
Apache License 2.0

Unicode Escape Sequences Support #108

Closed. XedinUnknown closed this 1 year ago

XedinUnknown commented 1 year ago

Hi!

This library works very well in my project, thank you very much for making a reliable streaming JSON parser!

The Problem

The API response that I need to parse contains strings with links that look like this:

https://click.justwatch.com/a?cx=eyJzY2hlbWEiOiJpZ2x1OmNvbS5zbm93cGxvd2FuYWx5dGljcy5zbm93cGxvdy9jb250ZXh0cy9qc29uc2NoZW1hLzEtMC0wIiwiZGF0YSI6W3sic2NoZW1hIjoiaWdsdTpjb20uanVzdHdhdGNoL2NsaWNrb3V0X2NvbnRleHQvanNvbnNjaGVtYS8xLTItMCIsImRhdGEiOnsicHJvdmlkZXIiOiJBcHBsZSBUViIsIm1vbmV0aXphdGlvblR5cGUiOiJidXkiLCJwcmVzZW50YXRpb25UeXBlIjoiaGQiLCJjdXJyZW5jeSI6IlVTRCIsInByaWNlIjo1MTkuNzQsIm9yaWdpbmFsUHJpY2UiOjAsImF1ZGlvTGFuZ3VhZ2UiOiIiLCJzdWJ0aXRsZUxhbmd1YWdlIjoiIiwiY2luZW1hSWQiOjAsInNob3d0aW1lIjoiIiwiaXNGYXZvcml0ZUNpbmVtYSI6ZmFsc2UsInBhcnRuZXJJZCI6MTI3MCwicHJvdmlkZXJJZCI6MiwiY2xpY2tvdXRUeXBlIjoianctY29udGVudC1wYXJ0bmVyLWFwaSJ9fSx7InNjaGVtYSI6ImlnbHU6Y29tLmp1c3R3YXRjaC90aXRsZV9jb250ZXh0L2pzb25zY2hlbWEvMS0wLTAiLCJkYXRhIjp7InRpdGxlSWQiOjIwOTgxLCJvYmplY3RUeXBlIjoic2hvdyIsImp3RW50aXR5SWQiOiJ0czIwOTgxIn19XX0\u0026r=https%3A%2F%2Ftv.apple.com%2Fus%2Fshow%2Fsurvivor%2Fumc.cmc.6ozd0mt09a86bpa19l885jv4z\u0026uct_country=us

When I render this for the users, the URL is broken.

Possible Cause

  1. The Unicode escape sequences, e.g. \u0026, are left undecoded. The backslash breaks URL parsing, because it is illegal there.
  2. This sequence also stands for the & (ampersand) character, which should appear in its place. Without it the query parameters are not separated, which breaks the link itself.
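
To illustrate the two points above, here is a minimal sketch in Python (chosen only for illustration; it does not use this library). The URLs are hypothetical, shortened stand-ins for the real link, and `query_params` is a helper made up for this example. It compares how the query string splits when the escape is left as literal text versus when it is replaced by the ampersand.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical, shortened stand-in for the link in the report.
# The raw string keeps \u0026 as six literal characters, as it appears when left undecoded.
raw = r"https://example.com/a?cx=abc\u0026r=https%3A%2F%2Fexample.org\u0026uct_country=us"

# The same link with each escape replaced by the character it stands for.
decoded = raw.replace(r"\u0026", "&")


def query_params(url):
    """Return the query parameters of a URL as a dict of lists."""
    return parse_qs(urlsplit(url).query)


# With the literal escape there is no separator, so everything lands under "cx".
print(sorted(query_params(raw)))      # ['cx']

# With real ampersands the three parameters are separated as intended.
print(sorted(query_params(decoded)))  # ['cx', 'r', 'uct_country']
```

With the escape left in place, the `r` and `uct_country` parameters never reach the server as separate values, which matches the broken link described above.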

Suggested Solution

According to the JSON spec, such Unicode escape sequences (\u followed by exactly four hexadecimal digits) should be decoded automatically.

Add support for Unicode escape sequences, according to spec.
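
As a rough sketch of the expected behavior, this is what a spec-compliant decoder does with such input. Python's standard `json` module is used here purely as a reference point, not as a statement about how this library is implemented. The payload and the `url` and `emoji` field names are assumptions made up for the example.

```python
import json

# Hypothetical, shortened payload; the raw string keeps the escapes exactly as
# they would appear in the JSON text received from the API.
payload = r'{"url": "https://example.com/a?cx=abc\u0026uct_country=us", "emoji": "\ud83d\ude00"}'

data = json.loads(payload)

# \u0026 is decoded to the ampersand it stands for.
print(data["url"])  # https://example.com/a?cx=abc&uct_country=us

# Characters outside the Basic Multilingual Plane are written as a surrogate
# pair of two \uXXXX escapes and decode to a single code point.
print(len(data["emoji"]), hex(ord(data["emoji"])))  # 1 0x1f600
```

A streaming parser that follows the spec should yield the same decoded string values as this one-shot decoder.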

halaxa commented 1 year ago

Could you find some time to open a PR with a failing test that reproduces this behavior?

halaxa commented 1 year ago

I added a passing test on master in 5c1d6c1. Can you check it?

XedinUnknown commented 1 year ago

Heya! Unfortunately, I don't have access to the project anymore. But the test looks like it covers this, and if it's passing, I'd say it's good.

halaxa commented 1 year ago

Ok, thanks anyway ;)