helix-editor / nucleo

A fast and convenient fuzzy matcher library for rust
Mozilla Public License 2.0
899 stars 28 forks source link

Expose and persist global item indices #56

Closed alexrutar closed 2 weeks ago

alexrutar commented 2 weeks ago

Currently, there is a Snapshot::get_matched_item which converts a match index n into the corresponding Item. However, accessing the global match index does not seem to be possible via the Item, or via any other of the methods in the Snapshot.

Would it be reasonable to make the additional API guarantee to

  1. Expose the internal global match index (i.e. the index in the internal boxcar::Vec) associated with an item, and
  2. Guarantee that once a given match index exists, it will continue to for the remainder of the lifetime of the matcher (short of calling restart).

I guess there are at least a few easy ways to do this (include the index inside the Item, add a method Snapshot::get_matched_item_index, ...) As far as I understand, this is currently the case internally. Actually, the global match indices already seem to be exposed by Injector::push.

My motivation for such a feature /API guarantee is from the downstream nucleo-picker crate. There are two main features that would benefit from this API change:

Of course, I understand that in concrete implementations, it is easy to enough to perform such de-duplication directly within the type (e.g. by implementing Hash). However, since nucleo-picker is also generic over the data type, I would have to subsequently impose any such requirements on the downstream user, which I would prefer to avoid.

pascalkuthe commented 2 weeks ago

It's been a while but I think the intent was that the get_matched_item functions were just convenience functions and that there would be a fn matches(&self) -> &[Match]. The steuxt is public already so I think that was probably my intend and just forgot to add this function.

It would be the API I would prefer here in general. The main reasln that the carious functilns that return an item exists is that there is quite a bit of a performance advantage to using get_item_unchechecked and I didn't want to make the jser write async code.

I want to note that the reason this never came up in helix is that we support restarting matches (like with interactive global search where the regex search restsrts if you change the regex query). At that point you can't use the index for caching and selection since the index can change between restart.

If you don't restart using the index is fine and I think having the API makes sense. But for a preview cache I do think a hashtable based cache is usually the better option (like we do in helix)