Closed by tarasglek 11 years ago
By "byte-level comparison", do you mean WritableComparator? That's very important for performance when combining/reducing, because creating the PyObjects is pretty expensive. I tend to think that we should not allow dicts or lists in keys, but we could allow them in values...
I'm going to take this and see if I can separate out a key class which supports comparison but not dicts, and a value class which supports dicts but not comparison.
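To make the proposed split concrete, here is a rough sketch in plain Python 3 (not Jython, and not the code from the pull request). The names `check_key` and `check_value` are hypothetical, and the exact set of allowed scalar types is an assumption:

```python
# Hypothetical sketch: keys are restricted to comparable scalars and tuples,
# values may additionally contain lists and dicts.
SCALARS = (str, int, float, type(None))  # assumed scalar types


def check_key(obj):
    """Return obj if it is a scalar or a (nested) tuple of scalars."""
    if isinstance(obj, tuple):
        for item in obj:
            check_key(item)
    elif not isinstance(obj, SCALARS):
        # dicts and lists land here, so they are rejected as keys
        raise TypeError("unsupported key type: %s" % type(obj).__name__)
    return obj


def check_value(obj):
    """Return obj if it is a scalar, tuple, list, or dict of valid values."""
    if isinstance(obj, (tuple, list)):
        for item in obj:
            check_value(item)
    elif isinstance(obj, dict):
        for k, v in obj.items():
            check_key(k)    # dict keys follow the stricter key rules
            check_value(v)
    elif not isinstance(obj, SCALARS):
        raise TypeError("unsupported value type: %s" % type(obj).__name__)
    return obj
```

With this split, only the key side ever needs an ordering, so dicts never have to be compared.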
https://github.com/tarasglek/jydoop/pull/1 contains the changes which do what I think we want here.
I did not do != 0 comparisons for dicts because that's confusing. I'm not convinced we need that.
I also did not implement byte-level dict comparison. I'm still not sure why you implemented it.
I'm still not quite sure why we need a 1:1 mapping between sorting on raw bytes and sorting on the higher-level data structures. The only thing that's important is that things that are equal as higher-level objects remain equal when represented as a bytestream; how they sort relative to other keys doesn't seem important. Is there some detail I'm missing?
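A small illustration of that argument, as a sketch in plain Python 3 rather than anything from jydoop itself. The encoding (`encode_int`, a big-endian signed 32-bit integer) is an assumption chosen only because its byte order disagrees with numeric order for negative numbers:

```python
import struct
from itertools import groupby


def encode_int(n):
    """Hypothetical key encoding: big-endian signed 32-bit integer."""
    return struct.pack(">i", n)


def group_counts(sorted_keys):
    """Count runs of adjacent equal keys, roughly how a reducer sees groups."""
    return {k: len(list(g)) for k, g in groupby(sorted_keys)}


keys = [3, -1, 2, -1]

logical = sorted(keys)                    # order of the Python objects
by_bytes = sorted(keys, key=encode_int)   # order of the raw bytes

print(logical)    # [-1, -1, 2, 3]
print(by_bytes)   # [2, 3, -1, -1]  (-1 encodes as ff ff ff ff)

# The two orders differ, but equal keys are adjacent in both,
# so the grouping is identical.
print(group_counts(logical) == group_counts(by_bytes))  # True
```

This only holds if nothing downstream depends on the output being sorted in the logical order; if a job does rely on that, the byte order would have to match.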