There are three tweaks to the formats of libraries that are breaking changes:
1) Switch to have chunks be a sorted list #56
2) Switch IDs to be a hash of the text + url #33
3) Change the name of content in Library #23
Each of these changes is actually pretty small, especially given how everything is factored. But it's probably best to lump them all together and do a version bump.
[x] Create a branch for the version bump, and do all of the following on a branch that is landed all at once
[x] Bump the version from 0 to 1
[x] Introduce machinery in Library().init to detect an old version and call upgrade(data), which repeatedly applies upgrade transformations until the data is at CURRENT_VERSION.
[x] Add an upgrader from 0 to 1. To start, it will just change the version to 1. (For each change in the later steps, also tweak the upgrade script and ensure the webclient works.)
[x] Switch it so chunks are at data['bits']
[x] (Don't forget to update all of the exposed methods/properties to be around 'bits', not 'chunks'. That can be done separately).
[x] Update naked library format to be more like a naked library (bits, etc) and also update card-web exporter
[x] Make bits a list of chunks. Remove sort.ids.
[x] Medium importer joins paragraphs with '', not '\n' (or maybe that's an artifact of the print(self._data))
[x] importer.get_chunks can just return the raw chunk, no need to return the id
[x] Isn't there a problem where a Chunk changes, e.g., its similarity, and the whole list should potentially be re-sorted?
[x] Move sort.reversed to sort_reversed, sort.type to 'sort', and sort.seed to seed.
[x] Remove the ability to pass a separate, non-canonical id to the Chunk constructor
[x] Consider removing Library.chunk_ids
[x] Clean up all instances of Chunk in APIs
[x] Add a --upgrade facility that runs through every library file in libraries, loads them up (which implicitly upgrades them) and then resaves if anything has changed.
[x] Land the branch, and tell everyone in the '#endpoints' channel to run the upgrader.
[x] After we're done and everyone has upgraded, we could have everyone swap the version number back down at the same time, since there are so few. But it's good to practice the version bumping and make sure it works, since it's an important part of what will make it actually federated.
[x] If each upgrader also has a downgrader, then hosts can downgrade to the target version to respond to old clients, too.
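
For reference, a minimal sketch of the upgrade machinery described in the checklist above. Only upgrade(data), CURRENT_VERSION, data['bits'], sort_reversed, sort and seed come from this issue. The UPGRADERS table, the _upgrade_0_to_1 helper, the 'version' key, and the assumed version-0 layout (chunks stored as an id-to-chunk dict under 'content', with ordering in sort.ids) are hypothetical and may not match the real code.

```python
import copy

CURRENT_VERSION = 1


def _upgrade_0_to_1(data):
    """Hypothetical 0 -> 1 step; the version-0 layout used here is an assumption."""
    old_sort = data.pop('sort', {})
    chunks_by_id = data.pop('content', {})
    # Use sort.ids for the ordering when present, else keep the dict's own order.
    ordered_ids = old_sort.get('ids', list(chunks_by_id))
    # bits becomes a plain list of chunks; sort.ids is dropped.
    data['bits'] = [chunks_by_id[i] for i in ordered_ids if i in chunks_by_id]
    # Flatten the remaining sort fields to top-level keys.
    if 'type' in old_sort:
        data['sort'] = old_sort['type']
    if 'reversed' in old_sort:
        data['sort_reversed'] = old_sort['reversed']
    if 'seed' in old_sort:
        data['seed'] = old_sort['seed']
    data['version'] = 1
    return data


# Maps a version number to the function that lifts data to the next version.
UPGRADERS = {0: _upgrade_0_to_1}


def upgrade(data):
    """Return an upgraded copy of data, applying one upgrader at a time."""
    data = copy.deepcopy(data)
    version = data.get('version', 0)
    while version < CURRENT_VERSION:
        if version not in UPGRADERS:
            raise ValueError(f'no upgrader from version {version}')
        data = UPGRADERS[version](data)
        new_version = data.get('version', 0)
        if new_version <= version:
            raise ValueError(f'upgrader for version {version} did not advance')
        version = new_version
    return data
```

Because upgrade returns a copy, a --upgrade pass could compare the result against the loaded data and resave only when they differ. A matching table of downgraders keyed by version would follow the same shape.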