dglazkov / polymath

MIT License

Have libraries maintain a stable order of content chunks #50

Open jkomoros opened 1 year ago

jkomoros commented 1 year ago

The current design of libraries erroneously assumes that the chunks in the content dict have a meaningful order. Python dicts only preserve insertion order (guaranteed since Python 3.7), not any sort order, and JSON objects are technically unordered.

That leads to oddities like get_chunk_infos_for_library having to re-sort content that was already sorted when it was selected and truncated, because the sort order was lost along the way.
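
As a small illustration of why dict order is a shaky foundation, the sketch below uses only the Python standard library. The chunk ids and the `similarity` field are made up for the example. It shows a ranking being lost once the chunks pass through a serializer that reorders keys:

```python
import json

# Hypothetical chunks, inserted in ranked order (best match first).
ranked = {
    "zebra": {"similarity": 0.9},
    "apple": {"similarity": 0.7},
    "mango": {"similarity": 0.5},
}

# Dict equality ignores order, so the ranking is not part of the value.
same_content = {
    "apple": {"similarity": 0.7},
    "mango": {"similarity": 0.5},
    "zebra": {"similarity": 0.9},
}
assert ranked == same_content

# A serializer is free to reorder keys; sort_keys=True is one common case.
round_tripped = json.loads(json.dumps(ranked, sort_keys=True))

ids_before = list(ranked)         # ranked order
ids_after = list(round_tripped)   # alphabetical order, ranking is gone
```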

Perhaps we amend library format like this:

{
  //...
  //If omitted, defaults to {"type": "any"}
  "sort": {
    //May be omitted if "any". The type of sort that will be maintained. May be any value that Library.query(sort=) accepts.
    "type": "any",
    //Whether the sort is reversed. (Sorts are descending by default.) If false, may be omitted.
    "reversed": true,
    //If "type" is anything other than "any", then every id in "content" must be here, in sorted order consistent with the "type". This is the order that Library.chunk_ids will return, and thus Library.chunks will respect.
    "ids": ["abc", "def", ...],
    //Seed may always be omitted. If "type" is "random" and a seed is provided, then the ids will be in an order that is consistent with that seed. Note that adding a new item will scramble the order of all items, but the order will be reproducible for a given list of ids and seed.
    "seed": "abc"
  },

  //Content is still a dict without order
  "content": {},

  //...
}
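
To make the proposed invariant concrete, here is a minimal sketch of how a loader might derive the chunk order from such a library dict, plus one possible way to get a seed-stable random order. Both function names (`ordered_chunk_ids`, `seeded_order`) are hypothetical and not part of polymath. Only the `sort`, `type`, `ids`, `seed`, and `content` keys come from the proposal above, and the sketch assumes `ids` is already stored in its final order (including any reversal).

```python
import random


def ordered_chunk_ids(library: dict) -> list:
    """Return chunk ids in the order declared by the library's "sort" section.

    Hypothetical helper illustrating the proposed format; not polymath's API.
    """
    content = library.get("content", {})
    sort = library.get("sort", {"type": "any"})
    sort_type = sort.get("type", "any")

    if sort_type == "any":
        # No order is promised, so whatever order the dict yields is acceptable.
        return list(content)

    ids = sort.get("ids", [])
    # Every id in "content" must appear exactly once in "ids".
    if len(ids) != len(set(ids)) or set(ids) != set(content):
        raise ValueError('"sort.ids" must list every id in "content" exactly once')
    return list(ids)


def seeded_order(ids, seed):
    """One possible seed-stable shuffle (an assumption, not the proposal's spec).

    Sorting first makes the result depend only on the set of ids and the seed,
    not on the order the ids happened to arrive in.
    """
    result = sorted(ids)
    random.Random(seed).shuffle(result)
    return result


library = {
    "sort": {"type": "random", "ids": ["b", "a"], "seed": "abc"},
    "content": {"a": {}, "b": {}},
}
order = ordered_chunk_ids(library)
```

With this approach, adding an id changes the input to the shuffle, which is why the whole order may be scrambled, while the same ids and seed always reproduce the same order.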

Library.set_chunk() and friends will maintain the sort order the library has been configured with (unless it is "any") whenever chunks are inserted or modified in a way that might change their position in that order.
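
A rough sketch of what that maintenance could look like follows. `OrderedLibrary` is a toy stand-in rather than the real Library class, and the `score` field and `sort_key` parameter are assumptions made for the example. It re-derives the order on every write, which is simple but not the most efficient option.

```python
class OrderedLibrary:
    """Toy stand-in for Library; hypothetical, for illustration only."""

    def __init__(self, sort_key=None, reversed_=False):
        # sort_key maps a chunk dict to a comparable value; None means "any".
        self.sort_key = sort_key
        self.reversed = reversed_
        self.content = {}  # unordered mapping of id -> chunk
        self.ids = []      # explicit order, only maintained when sort_key is set

    def set_chunk(self, chunk_id, chunk):
        self.content[chunk_id] = chunk
        if self.sort_key is None:
            return
        if chunk_id not in self.ids:
            self.ids.append(chunk_id)
        # Sorts are descending by default; "reversed" flips them to ascending.
        self.ids.sort(
            key=lambda i: self.sort_key(self.content[i]),
            reverse=not self.reversed,
        )

    @property
    def chunk_ids(self):
        if self.sort_key is None:
            return list(self.content)
        return list(self.ids)


lib = OrderedLibrary(sort_key=lambda chunk: chunk["score"])
lib.set_chunk("a", {"score": 0.2})
lib.set_chunk("b", {"score": 0.9})
lib.set_chunk("c", {"score": 0.5})
after_inserts = lib.chunk_ids

# Modifying an existing chunk moves it to its new position.
lib.set_chunk("a", {"score": 1.0})
after_update = lib.chunk_ids
```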

This blocks further progress on solving #14 in a sustainable way.

jkomoros commented 1 year ago