xyncro / chiron

JSON for F#
https://xyncro.tech/chiron
MIT License
173 stars 41 forks source link

Better escaping performance #78

Closed neoeinstein closed 7 years ago

neoeinstein commented 7 years ago

This is a more efficient algorithm for escaping a JSON string. The new process has time and space complexity of O(n) where n is the length of the string to be escaped (if I account properly), and requires no more than 2 full scans of the string in the worst case and 1 full scan of the string.

In the event a string has none of the characters which need to be escaped, the input string will be returned without additional allocations. Otherwise, a string builder will be allocated with initial capacity set to the length of the input string. For the cost of an additional scan of the string, we could determine the exact length of the output string. Testing indicated that this added extra overhead cost without significant benefit. Then we step through the string, stopping at each character which needs to be escaped and adding the unescaped characters and expanding the target characters. Once we have processed all the values which must be escaped, we return the new string from the string builder.

Also included is a table of escape characters. These are all the characters in UTF-8 that need escaping for JSON. Unicode characters beyond U+FFFF should not need to be further escaped due to encoding the value in UTF-8. As String.IndexOfAny does not take a predicate, this array is necessary to take advantage of the internal BCL optimizations.

On my hardware, I am achieving ~4ms for 2.5MB of string that needs no escaping and ~75ms for 2.5MB of string with significant escaping required.

neoeinstein commented 7 years ago

/cc @etaullaj

kolektiv commented 7 years ago

This looks great! Definitely a more performant way of doing things 😄 The existing implementation was very much a "do the simple thing idiomatically, don't worry about speed yet" so this is the logical next step.

I'm going to get this merged in, bump versions and update the build (unrelated!). Hopefully it should be out there today at some point...

kolektiv commented 7 years ago

A bit of a delay at the moment as I can't access my Appveyor account for Xyncro - http://help.appveyor.com/discussions/problems/5310-access-to-github-organisation-has-disappeared is a bit of a concern! Hopefully that will get sorted shortly...