Closed lihaoyi closed 9 months ago
Hey @lihaoyi, thank you for improving upickle. However such a change should lead to a new minor release as recommended here.
This is unfortunate because your library claims to follow the versioning scheme semver-spec
but in practice it does not follow it, which could lead to bad surprises.
This change is meant to be source and binary compatible. Your link states that such a change should be a patch version
@ckipp01 has raised some issues with our mima config that caused some breakage to slip through. That's a seprate issue: it's just a bug and I will fix it and cut a new issue
In general, adding a new method can break the source compatibility. However, since in your case you added an overload maybe the situation is different, I am not 100% sure.
sortkeys
is new API, hence we need to release it as a new minor version. The binary compatibility shims added here are just technical details. But if they would be not there, we would need to bump the major version, as it results in some API removals.
In a patch release, no API changes are allowed, only internal changes.
This PR allows anyone writing JSON to pass in
sortKeys = true
to make the object keys get output in sorted order.Major changes:
Introduced a new
upickle.core.BufferedValue
type. This is a sealed trait hierarchy that can be used to capture all the method calls received by aupickle.core.Visitor
, and re-play them later. It replacesIndexedValue
, which has JSON-specific idiosyncracies (e.g. converting MsgPack visitor calls to their JSON equivalents) whereasBufferedValue
precisely captures theVisitor
method calls unchanged.IndexedValue
is left for binary compatibility but no longer used internally, even its ordering use case of buffering up things until a$type
tag is found has been replaced byBufferedValue
As part of the generalization from
IndexedValue
toBufferedValue
, we now need to handle non-JSON usage patterns such as non-string-keyed objects (e.g. found in msgpack). This makes sorting these objects non-trivial, as their keys can now be arbitraryBufferedValue
s.BufferedValue.valueToSortKey
is a rough hack to traverse any structured keys and make them sortable.Intercept all
transform
calls in the primary upickle/ujson APIs withBufferedValue.maybeSortKeysTransform
. This checks the newsortKeys
boolean flag, and if true it intercepts the visitor calls, captures them in aBufferedValue
, recursively traverses and sorts theBufferedValue
's object keys, then re-plays it back outBinary compatibility stubs have been added to all methods who received a new
sortKeys: Boolean = true
parameter. These stubs were not marked as@Deprecated
, because in some cases the Scala compiler would preferentially resolve the stub over the method-with-default-value, causing spurious deprecation warnings.Testing
sortKeys
unit tests in uJson and uPickle, and has been added to theTestUtil.scala
round-trip tests to ensure that all existing tests exercise the a subset of thesortKeys = true
codepaths for JSON and MsgPack and verify that the round-tripping is still successfulNotes
It took a surprising amount of code to make this work well, as it goes against uPickle's "streaming-no-intermediate-data-structures" way of doing things.
sortKeys = true
has to buffer up some kind of intermediate JSON AST somewhere. I would expect performance to be worse than the default zero-intermediate-datastructures visitor approach, and so optimizing thesortKeys = true
code paths was not a priority in this PR. If anyone wants maximum performance, they can leave it as the defaultsortKeys = false
The whole
Transformer
/transform
abstraction in the codebase is kind of half-baked and unsatisfactory, but I left it mostly as is for now with only small tweaks to make things work (e.g. allowing the implementation ofBufferedValue.maybeSortKeysTransform
to live inupickle.core
). Realistically, the whole nameTransformer
and the concept it represents probably needs to be overhauled, but that's out of scope for this PRTestUtil.scala
is getting big and messy, and I'm adding to the mess. It's probably OK to leave for now, but it could definitely use a cleanup