Closed davidlatwe closed 6 years ago
The dataset_sample/mixed_type.json in c8f8688 is a deliberately messy data set. It exists to test the sort method and make sure it behaves the same as Mongo when all kinds of data are sorted together.
sort() usage will change after this merge.
# Before
collection.find().sort({"field": direction})
# After
collection.find().sort("field", direction)
# or
collection.find().sort("field.child-field", direction)
# or
collection.find().sort([
    ("field1.child-field", direction1),
    ("field2", direction2),
])
sort() is working after this merge.
Using Python to reproduce MongoDB's document sorting behavior
ASCENDING = 1 # small -> big
DESCENDING = -1 # big -> small
[]          ASCENDING
None            |
Numeric         v
String
Dict
List 1d         ^
List Nd         |
Bool        DESCENDING
To decide which value is bigger or smaller, first sort the data by type in the order above, then compare the values.
If two values compare equal, the docs keep the order in which they were found.
MongoDB Manual - BSON Types Comparison Order
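The type-then-value rule can be sketched roughly as below. This is a minimal illustration, not the code from this merge: `type_weight` and `sort_mixed` are hypothetical names, the weights follow the order listed above, and only `[]`, None, numbers, strings and bools are handled.

```python
ASCENDING = 1
DESCENDING = -1


def type_weight(value):
    """Return the sort weight of a value's type (hypothetical helper)."""
    if isinstance(value, list) and not value:
        return -1                      # [] sorts before None
    if value is None:
        return 0
    if isinstance(value, bool):        # check before int: bool subclasses int
        return 5
    if isinstance(value, (int, float)):
        return 1
    if isinstance(value, str):
        return 2
    raise TypeError("not handled in this sketch: %r" % (value,))


def sort_mixed(values, direction=ASCENDING):
    """Sort by (type weight, value); equal keys keep their found order."""
    return sorted(
        values,
        key=lambda v: (type_weight(v), v),
        reverse=(direction == DESCENDING),  # sorted() stays stable when reversed
    )


print(sort_mixed([True, "b", None, 3, [], "a", 1.5]))
# [[], None, 1.5, 3, 'a', 'b', True]
```

Because the weight comes first in the key, two values of different types are never compared directly, which is what lets a number and a string live in the same sort.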
A list is compared with other data through its smallest member (in ASCENDING) or its largest member (in DESCENDING). The array therefore drops one dimension when sorting: a one-dimensional array sorts among data of the same type as its comparing member. If those smallest or largest members are equal, the docs keep their found order.
ASCENDING : smallest member
DESCENDING : largest member
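A rough sketch of the member selection, assuming flat lists of scalars only. `WEIGHTS`, `sort_key` and `sort_values` are hypothetical names for illustration; empty and nested lists are left out.

```python
ASCENDING = 1
DESCENDING = -1

# Exact-type lookup, so bool does not fall into the int bucket.
WEIGHTS = {type(None): 0, int: 1, float: 1, str: 2, bool: 5}


def sort_key(value, direction):
    """A non-empty list is represented by its min (ASCENDING) or max (DESCENDING) member."""
    if isinstance(value, list):
        weighted = [(WEIGHTS[type(m)], m) for m in value]
        return min(weighted) if direction == ASCENDING else max(weighted)
    return (WEIGHTS[type(value)], value)


def sort_values(values, direction=ASCENDING):
    return sorted(
        values,
        key=lambda v: sort_key(v, direction),
        reverse=(direction == DESCENDING),
    )


print(sort_values([[5, 1], 3, [2, 9]], ASCENDING))   # [[5, 1], [2, 9], 3]
print(sort_values([[5, 1], 3, [2, 9]], DESCENDING))  # [[2, 9], [5, 1], 3]
```

Note how `[5, 1]` sorts first in ASCENDING (through its member 1) but second in DESCENDING (through its member 5), so the two results are not mirror images of each other.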
A dictionary is compared with other dict type docs by iterating over its key-value pairs: first by the value's data type, then by the key string, and lastly by the value. If a pair compares equal, the comparison moves on to the next key-value pair; if all key-value pairs are equal, the docs keep their found order.
valueType_1 -> key_1 -> value_1 -> ... -> valueType_N -> key_N -> value_N
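One way to get this ordering in Python is to flatten each dict into a tuple in exactly that sequence and let tuple comparison do the rest. The sketch below assumes scalar values only (no nested dicts or lists); `WEIGHTS` and `encode_dict` are hypothetical names.

```python
# Exact-type lookup, so bool does not fall into the int bucket.
WEIGHTS = {type(None): 0, int: 1, float: 1, str: 2, bool: 5}


def encode_dict(d):
    """Flatten a dict to (3, (valueType_1, key_1, value_1, ..., valueType_N, key_N, value_N))."""
    flat = []
    for key, value in d.items():       # dicts keep insertion order
        flat.extend((WEIGHTS[type(value)], key, value))
    return (3, tuple(flat))            # 3 is the weight used for Dict


docs = [{"b": 1}, {"a": "x"}, {"a": 2}, {"a": 1}]
print(sorted(docs, key=encode_dict))
# [{'a': 1}, {'a': 2}, {'b': 1}, {'a': 'x'}]
```

The string value sorts last even though its key is "a", because the value's type weight is compared before the key.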
If the value is None or [None], the docs sort among each other by found order.
If the field is not found, it is treated as None.
[]
If the value is an empty list [], it is treated as less than null.
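The None, missing-field and [] rules can be illustrated together. This is a simplified sketch: `lookup` and `null_aware_key` are hypothetical helpers, the dotted-path lookup only walks nested dicts (it does not descend into arrays), and non-null values are assumed to be numbers.

```python
def lookup(doc, path):
    """Follow a dotted path through nested dicts; a missing field yields None."""
    current = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def null_aware_key(value):
    if isinstance(value, list) and not value:
        return (-1, 0)                 # [] is less than null
    if value is None or value == [None]:
        return (0, 0)                  # None and [None] tie; found order decides
    return (1, value)                  # numbers only in this sketch


docs = [{"a": 2}, {"a": None}, {"a": []}, {}, {"a": [None]}, {"a": 1}]
ordered = sorted(docs, key=lambda d: null_aware_key(lookup(d, "a")))
print(ordered)
# [{'a': []}, {'a': None}, {}, {'a': [None]}, {'a': 1}, {'a': 2}]
```

The three null-like docs come out in the order they were found, since Python's sort is stable and their keys are identical.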
# basic form for comparing
(section(int), (type(int), value(*)), index(int))
To realise multi-sort, the next sort action must not override the previous sorting result; this is done by making sections (groups).
Docs that share a section id have the same key-value in this sorting action.
To compare values of various types at the same time, a weight for the value's type is compared before the value itself.
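The section idea can be sketched as follows. This is not the tuple-based implementation from this merge but a simpler variant of the same idea: each sort action only reorders docs inside a section, then splits every section into smaller ones wherever the key value changes. `multi_sort` is a hypothetical name, and field values are assumed to be present and mutually comparable.

```python
from itertools import groupby


def multi_sort(docs, specifiers):
    """Apply [(field, direction), ...] one by one without crossing section borders."""
    sections = [list(docs)]            # everything starts in one section
    for field, direction in specifiers:
        next_sections = []
        for section in sections:
            ordered = sorted(
                section,
                key=lambda d: d[field],
                reverse=(direction == -1),
            )
            # Docs with equal values for this field form one new section.
            for _, group in groupby(ordered, key=lambda d: d[field]):
                next_sections.append(list(group))
        sections = next_sections
    return [doc for section in sections for doc in section]


docs = [
    {"a": 1, "b": 2, "n": 0},
    {"a": 2, "b": 1, "n": 1},
    {"a": 1, "b": 3, "n": 2},
    {"a": 2, "b": 5, "n": 3},
]
result = multi_sort(docs, [("a", 1), ("b", -1)])
print([d["n"] for d in result])        # [2, 0, 3, 1]
```

Sorting by "b" never moves a doc out of its "a" group, which is the property the section id is meant to guarantee.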
[] (less than null)
# weight -1
(-1, [])
None, NoMatch
# weight 0
(0, None)
Numeric, String, Bool
# weight 1
(1, 3.1415)
# weight 2
(2, "Hello")
# weight 5
(5, True)
Dict
# weight 3
(3, (2, "color", "#FF00AA", ..., type-N(int), key-N(str), value-N(*), ...))
List
When comparing with docs :
# Looking for min/max member to compare with other docs
min([(1, 72), ..., (type-N(int), value-N(*)), ...]) # or max
When comparing with other List :
# weight 4, iter all member inside the List
(4, ((1, 72), ..., (type(int), value(*)), ...))
This index is not the doc _id.
If the docs' key-values are still equal after sorting, they are ordered by this iteration index.
The index is always ordered in ASCENDING.
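Keeping the index ascending while the value sorts in DESCENDING needs a little care when both live in one key tuple. A minimal sketch, assuming comparable values and a hypothetical `sort_with_index` helper: the index is negated for DESCENDING so that reversing the whole sort restores the found order among ties.

```python
ASCENDING = 1
DESCENDING = -1


def sort_with_index(docs, field, direction=ASCENDING):
    """Order by docs[field]; ties always fall back to ascending found order."""
    keyed = [(doc[field], index, doc) for index, doc in enumerate(docs)]
    sign = 1 if direction == ASCENDING else -1
    ordered = sorted(
        keyed,
        key=lambda k: (k[0], sign * k[1]),   # index flipped under DESCENDING
        reverse=(direction == DESCENDING),
    )
    return [doc for _, _, doc in ordered]


docs = [
    {"v": 1, "id": "a"},
    {"v": 2, "id": "b"},
    {"v": 1, "id": "c"},
    {"v": 2, "id": "d"},
]
print([d["id"] for d in sort_with_index(docs, "v", DESCENDING)])
# ['b', 'd', 'a', 'c']
```

In both directions "b" stays ahead of "d" and "a" ahead of "c", matching the order in which the docs were found.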
For commit 880134a
The reason for that commit is to reset TinyDB's _id. Without it, the _id generated by TinyDB keeps incrementing after delete_many({}), which makes the doc found order inconsistent every time you repeat delete_many({}) and find(), and that is bad for testing.
Sorry for the information overload; I think that's all I need to add for this merge. Please let me know if there are any issues.
Motive
To behave more like Mongo.
Changed
$all
sort_specifiers, and sorting basic types of data
Still more to do, but I think it's time to merge, since this is already a big change. Thank you :)