Weebly / Cereal

Swift object serialization
BSD 3-Clause "New" or "Revised" License

Massive changes to improve encode/decode speed #19

Closed Sega-Zero closed 8 years ago

Sega-Zero commented 8 years ago

Related to #16. This pull request breaks backward compatibility, so a v2.0.0 tag should be created.

What has been done:

  1. Refactored the encoder to use an indirect enum with associated values as the backing storage for the data being encoded.
  2. Removed all string-related code from both CerealEncoder and CerealDecoder and refactored the code to use the new enum.
  3. To improve speed, internal dictionaries are replaced with arrays of tuples. This adds a bit more data to the resulting NSData if a value is replaced during encoding, but the decoder is guaranteed to decode only the last value.
  4. Introduced a new logic layer, CerealSerialization. Its function is to serialize/deserialize the CoderTreeValue enum to/from an NSData object. Right now, for better speed, it works with raw-byte TLV (type-length-value) structured data, but it may be extended to other formats in the future: XML, JSON, or maybe even the old string format to restore backward compatibility with 1.x versions.
  5. All the tests are rewritten to use the new TLV byte arrays.
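
To make the list above more concrete, here is a minimal sketch of an indirect enum serialized as type-length-value bytes, plus a "last value wins" lookup over an array of tuples. It is written in current Swift and is not Cereal's code: `TreeValue`, its cases, the type codes, the 4-byte little-endian length field, and `lastValue(forKey:in:)` are all assumptions made for illustration.

```swift
// Hypothetical stand-in for an enum-backed coder tree (not Cereal's CoderTreeValue).
indirect enum TreeValue: Equatable {
    case int(Int64)
    case string(String)
    case pair(key: TreeValue, value: TreeValue) // recursive case, hence `indirect`
    case list([TreeValue])
}

// Serializes a value as TLV: 1 type byte, 4 length bytes (little-endian), then the payload.
// The type codes 1...4 are arbitrary choices for this sketch.
func serialize(_ value: TreeValue) -> [UInt8] {
    let type: UInt8
    let payload: [UInt8]
    switch value {
    case .int(let number):
        type = 1
        payload = withUnsafeBytes(of: number.littleEndian) { Array($0) }
    case .string(let text):
        type = 2
        payload = Array(text.utf8)
    case .pair(let k, let v):
        type = 3
        payload = serialize(k) + serialize(v)
    case .list(let items):
        type = 4
        payload = items.flatMap { serialize($0) }
    }
    let length = withUnsafeBytes(of: UInt32(payload.count).littleEndian) { Array($0) }
    return [type] + length + payload
}

// With an array of tuples a key may appear more than once;
// scanning from the end returns the most recently stored value.
func lastValue(forKey key: String, in entries: [(key: String, value: TreeValue)]) -> TreeValue? {
    return entries.last(where: { $0.key == key })?.value
}
```

Appending to an array avoids hashing on every insert, which is presumably where the speed-up comes from; the cost is that a replaced value stays in the output until the decoder skips it.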

Known issues: Since the length of Int, Float and Double differs between 32-bit (Int == Int32) and 64-bit (Int == Int64) platforms, the result is incompatible between platforms. IMHO, there is no need to support this: all new Apple devices are 64-bit, and 32-bit devices will no longer be supported in a couple of years. Those who need complete compatibility should use the 1.x version. The tests are written for 64-bit devices.
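
As an illustration of the platform-width issue, the sketch below contrasts writing an `Int` in its native width with widening it to a fixed 8 bytes first. This is not what the pull request does; `nativeBytes` and `fixedWidthBytes` are hypothetical helpers. In standard Swift, `Int` follows the platform word size, while `Float` and `Double` are declared as 32-bit and 64-bit types, so the sketch only covers `Int`.

```swift
// Hypothetical helper: dumps an Int exactly as it is laid out in memory.
// The result has MemoryLayout<Int>.size bytes: 8 on 64-bit, 4 on 32-bit platforms.
func nativeBytes(_ value: Int) -> [UInt8] {
    return withUnsafeBytes(of: value) { Array($0) }
}

// Hypothetical helper: widens to Int64 and fixes the byte order,
// so the payload is always 8 little-endian bytes regardless of platform.
func fixedWidthBytes(_ value: Int) -> [UInt8] {
    return withUnsafeBytes(of: Int64(value).littleEndian) { Array($0) }
}
```

A fixed-width payload would trade a few extra bytes on 32-bit devices for output that both platforms can read.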

ketzusaka commented 8 years ago

The test changes are really great. How did you go about getting the values for those?

Sega-Zero commented 8 years ago

That was very challenging. I set a breakpoint in each test and then used one of these helper functions:

func printBytes(array: [UInt8]) {
    let str = array.reduce("[") { $0.0 + String($0.1) + "," }
    print(str.substringToIndex(str.endIndex.predecessor()) + "]")
}

func encode(value: (inout CerealEncoder) throws -> ()) {
    var encoder = CerealEncoder()
    let _ = try? value(&encoder)
    printBytes(encoder.toBytes())
}

Then I printed the resulting bytes, copied them, and pasted them into the test body.

For the encoder dictionary tests I wrote one more function:

func printXCT(result: [UInt8]) {
    let prefix = result[0..<47].reduce("") { $0.0 + String($0.1) + "," }
    let divider = (result.count - 47)

    let leftFrom = 47
    let leftUntil = leftFrom + divider / 2

    let rightFrom = leftUntil
    let rightUntil = rightFrom + divider / 2

    let left = result[leftFrom..<leftUntil].reduce("") { $0.0 + String($0.1) + "," }
    let right = result[rightFrom..<rightUntil].reduce("") { $0.0 + String($0.1) + "," }
    print("XCTAssertTrue(result.hasArrayPrefix([\(prefix.substringToIndex(prefix.endIndex.predecessor()))]))\nXCTAssertTrue(result.containsSubArray([\(left.substringToIndex(left.endIndex.predecessor()))]))\nXCTAssertTrue(result.containsSubArray([\(right.substringToIndex(right.endIndex.predecessor()))]))")
}

Since all the tests there were using the same key, wat, I cut off the header bytes and the key string bytes first (47 bytes in total) and then split the rest in half. I added a few line breaks where an array was too long, or to separate a dictionary subarray for readability.
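
The generated assertions call `hasArrayPrefix` and `containsSubArray`, whose definitions are not shown in this thread. The sketch below is only a guess at what such helpers could look like in current Swift, not the versions used in Cereal's tests. Checking the two halves independently is presumably needed because Swift does not guarantee the iteration order of a Dictionary, so the two entries may be encoded in either order.

```swift
// Hypothetical implementations; the real test helpers may differ.
extension Array where Element: Equatable {
    // True when the array begins with exactly the given elements.
    func hasArrayPrefix(_ prefix: [Element]) -> Bool {
        return self.starts(with: prefix)
    }

    // True when `sub` occurs somewhere in the array as a contiguous run.
    func containsSubArray(_ sub: [Element]) -> Bool {
        guard !sub.isEmpty else { return true }
        guard sub.count <= count else { return false }
        for start in 0...(count - sub.count) {
            if Array(self[start..<(start + sub.count)]) == sub {
                return true
            }
        }
        return false
    }
}
```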

The hardest part was the decoding tests. Each test was prepared manually by setting a breakpoint and typing expressions like this into the console:

po self.encode { try $0.encode([MyBar(bar: "baz"):[1.0,2.0] as [Double]], forKey: "hi") }

There were a few tests where I couldn't do this (like the error ones), so I assembled the whole byte array piece by piece and then debugged it carefully %)

Sega-Zero commented 8 years ago

Is there anything else that should be fixed before merging this PR? :)

ketzusaka commented 8 years ago

Nope, I think this is good to merge. I'm going to do some testing with my projects using this tomorrow to make sure it jibes well, and if all is well I'll tag and release :)

Sega-Zero commented 8 years ago

Awesome! I'll finally switch my projects' Podfile to upstream =)