indexmap-rs / indexmap

A hash table with consistent order and fast iteration; access items by key or sequence index
https://docs.rs/indexmap/
Other
1.71k stars 150 forks source link

Advice about userland indexmap wrapper for clustered inputs #356

Open Yomguithereal opened 1 day ago

Yomguithereal commented 1 day ago

Sorry about the fuzzy issue name but I currently use indexmap to create on my end a custom hashmap-like structure able to minimize the number of lookups when the data comes sorted or mostly sorted (in chunks often). Thinks of a table with a column used as key and another as value, and rows where it is frequent to have the same key repeated multiple times before changing.

Since indexmap stores a dense vector of entries and can swap indices in constant time, I use it to check if the last entry has the same key as what I want to insert before even trying to insert normally through a hashmap lookup. And, if the hashmap lookup gets an occupied entry, I swap it to put it in last position so that next insert with the same key can forego said lookup.

This niche hashmap can very well be developed from userland and you can find one here: https://github.com/medialab/xan/blob/master/src/collections/clustered_insert_hashmap.rs

Where I struggle though is to expose some kind of entry interface to my users. I can very well work with callback-based methods as demonstrated here: https://github.com/medialab/xan/blob/master/src/collections/clustered_insert_hashmap.rs#L78-L115 but they have their issues as well and I would love to be able to expose some kind of entry system. But I need to be able to work both with indexmap::map::Entry and indexmap::map::IndexedEntry, as well as swapping an occupied entry for the user if relevant, but I cannot make it work because the borrow checker does not let me or because entry methods swapping indices takes ownership of the entry and I cannot return it to the user afterwards. I cannot either use a DropGuard because I cannot access the mutable reference to the map from the entry, from userland at least.

So I was wondering if you would have some advice, or if there is something obvious I could contribute here.

cuviper commented 1 day ago

but I cannot make it work because the borrow checker does not let me or because entry methods swapping indices takes ownership of the entry and I cannot return it to the user afterwards.

Can you give a more complete example of what you would like to be able to do here?

My initial though is that locally, you could get an entry, save its index, and perform the swap, and then construct a new IndexedEntry (which is cheap) for passing back to the user. There's also From between that and OccupiedEntry in both directions, and while it does require a hash table lookup to convert to occupied, that's a simple index comparison rather than a full hash/eq lookup.