Closed mitchellwrosen closed 3 years ago
Seems like these should be two different types with two different Ord
instances. So like Name
for the ordinary use, and SuffixedName
or something when wanting to search by name suffix.
While it's possible to have the "source of truth" for Name
be ordered in some other way (and that was considered), it just means you have to create and maintain auxiliary indices to support the suffix-based lookup and search that we do everywhere (and which we want to be logarithmic time). Creating and maintaining aux indices is more complicated and hasn't worked out great previously.
If this is a bottleneck, we could change the representation of Name
to be Name [NameSegment] IsAbsolute
where the name segments are in reverse order. Or even pack that into a single Text
. (So List.map
has an internal representation of "map.List"
) That would make comparisons just as cheap as comparing Text
while also giving us the suffix-based querying we want.
We previously would do an O(n) "suffixification" pass over the namespace to compute an index for supporting suffix lookups - this was slow, didn't let us determine ambiguous suffixes (https://github.com/unisonweb/base/issues/44), and it needed to be recomputed every time the namespace changed. It also was getting recomputed on every character typed in the fuzzy find.
There is nothing useful about the default lexicographical ordering of Text other than that it's fast (but if we just want that, we might just use hashing instead of Ord
). It doesn't correspond to human alphabetical ordering. It doesn't let us do suffix-based search. IMO it doesn't deserve to be the default ordering for names and it was chosen previously without really any thought. :)
More background on all this in #2290
Thanks for the background.
I totally agree that the Ord instance derived from Text is not semantically so useful or meaningful, and I really like how searchRan
can carve out names with similar suffixes.
To that end I do like the idea of changing Name
's representation to allow for a faster Ord
implementation at the expense of some other functions.
I'm not sure a newtype is the right solution here - I don't think we want any (new)type that has an expensive Ord
instance which is put into a Relation
.
I noticed in a time profile that comparing
Name
s contributes a nontrivial amount of time to computing a branch diff.Here's the
Ord
instance:Perhaps we should consider replacing this instance with something far cheaper to compute (ideally O(1)), since
compare
is often assumed to be cheap and performed many times on the fly, especially when dealing withMap
s,Set
s,Relation
s, etc.Hopefully it will be possible to achieve this without making the use cases for which we want to sort and search by name suffix prohibitively expensive. I'm not sure where we do that kind of thing currently.