francoispqt / gojay

high performance JSON encoder/decoder with stream API for Golang
MIT License
2.11k stars 112 forks source link

Handle string escaping correctly #56

Closed lorenzo-stoakes closed 6 years ago

lorenzo-stoakes commented 6 years ago

Hi there,

This is a larger patch from my previous (as you might have guessed, I am working on a project that uses gojay and testing against datasets :) and is more of a suggestion/beginning of a discussion than a straight-up bug fix.

In testing it seemed to me the previous escaped string decoder implementation was not behaving correctly, for example decoding "\nx" would result in an error. Additionally, it seemed like the tests were expecting the JSON to be escaped like e.g. "\n" rather than "\n".

The patch fixes these issues and updates the tests to what I feel they ought to be, though I may easily have made mistakes/misunderstood things here so please feel free to correct me.

Thanks very much for the project by the way, it's giving me a 4x speedup vs. "encoding/json" :) I hope my upstreaming fixes is helping (I'm running a slightly modified fork in my project but want to contribute back bugfixes and any changes pertinent to the unmodified library.)

Cheers, Lorenzo

The previous implementation failed on a number of basic tests e.g. "\nx", seeming to treat "\n" as "\n". It additionally seemed rather over-complicated. This patch tries for a simpler implementation.

This patch additionally adds the \/ escape chord which is supported by RFC4627 (and the "encoding/json" also.)

codecov-io commented 6 years ago

Codecov Report

Merging #56 into master will decrease coverage by 0.05%. The diff coverage is 94.28%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #56      +/-   ##
==========================================
- Coverage     100%   99.94%   -0.06%     
==========================================
  Files          30       30              
  Lines        3537     3486      -51     
==========================================
- Hits         3537     3484      -53     
- Misses          0        2       +2
Impacted Files Coverage Δ
decode_string.go 98.3% <94.28%> (-1.7%) :arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update f4c5a97...aa9e309. Read the comment docs.

francoispqt commented 6 years ago

Let me review this and maybe add some tests. I will add this to the next release which should happen this weekend. Sorry for the delay, was on holiday.

lorenzo-stoakes commented 6 years ago

No problem, hopefully this is helpful (and hol was good :) and feel free to come back with any issues with this, it's a pretty simplistic implementation (and obv. I missed some coverage.)

Cheers, Lorenzo

francoispqt commented 6 years ago

So it has been a misunderstanding on my side about the escape sequence when encoding and your solution seems good. I'm adding some tests and checking the encoding as well.

Planning to release 0.11.0,1.1.3,1.0.4 with your pull request and 1.2.0 with new types and your pull request tomorrow.

Thanks