Open ratmice opened 3 years ago
+1 to this functionality or something like it.
Our use case currently involves simply rebuilding the whole hashmap from scratch each time a new configuration file is downloaded rather than updating it in place. We have this implemented in erlang using ets
tables that we create and drop periodically, specifically we drop after we're reasonably sure no more readers are still referencing them, although no guarantees. evmap
would allow us to rebuild this usecase in a way, however here are some considerations:
1) the forced oplog clearly adds overhead to this work, since data first makes it inside the oplog and then goes in the handle upon publish time. Sadly I'm too new to Rust to think about a way to deal with old readers still accessing this old ReadHandle
but if there was one way, perhaps you could just give more direct access to the evmap without going through the oplog
, and simply return an error for stale reads.
2) while in the current system we can discard the new ets table if the process that is building it fails to parse the file, in this new evmap world there is no transaction concept and thus you would end up with an oplog
that is effectively invalid at this point. This is the case even if one were to implement this by calling first WriteHandle::purge()
and then re-inserting all the keys, and you would actually also pay the price of iterating through each key 2 times I think. And in reality if there's a bug in the code that is supposed to generate the new evmap version you'd end up with a quickly growing memory leak inside this oplog
.
3) the evmap::WriteHandle::extend(...)
call could be helpful here, except it only supports inserts vs all the other operations like this crate does. For example the writer could submit a big transaction of its own with all of the operations needed and make the whole thing effectively atomic via left-right
new WriteHandle::extend()
call. It's not a big deal now if the new values don't change as they end up in the hashbag, but if they do change then the hashbag has no guarantees on returning the last item inserted, so using left-right
original extend provides more atomic functionality.
I realize this comment straddles a bit between evmap
and left-right
but I hope it's fine anyway as the problem bridges a bit the 2.
@dialtone Ah, I think maybe you have a different impression of how evmap
works than how it actually works. It doesn't constantly create a new map for writes, it continues to use the same two maps over and over, and just shifts readers from one to the other, with the writer carefully choosing when it's safe to modify each.
The rollback being discussed here is specifically about what you do if you enqueue a number of operations and then want to void them instead of calling publish
. In that case, rolling back is pretty easy: simply truncate the oplog down to the operations that are already present in the read map (self.swap_index
). I think that's all that's needed here. Not sure if that addresses the use-case you have in mind though?
@jonhoo fwiw, the rollback I was thinking above is slightly more complex than just tossing away the oplog, but after publish
, being able to revert back to the previous published state that existed before the latest call to publish
in addition to the current oplog.
Or at least noting that the way that writes are staggered seems to make that possible.
Yeah, I think it might be possible to roll back the last publish
as well either by:
rollback
wait for readers to depart, swap again immediately (and wait), then clear the write map, then copy in all elements from the reader map (which is now the map from before the prior publish); or2 is far more complex, but also much more efficient!
@jonhoo hey, I think I may have explained myself badly, sorry about that :).
Yeah I understand evmap
always uses the same 2 maps, and in fact it reuses the same pool of linked values as well. My point was simply to explain a usecase in which transactionality of changes was achieved by simply re-creating the whole map each time. This can be avoided if it were possible to queue up a bunch of changes that one could rollback if, for whatever reason, the process of updating the chunk of keys didn't go successfully.
When looking at how the
refresh
mechanism works on stream, it ocurred to me that it seems like a (single level)rollback
mechanism seems like it would be possible to implement, by:stale
map,read
map,stale
map, as the new write map,oplog
,I think the
first
/second
drop seems tricky though, because you would need to perform thesecond
drop, for the items in the oplog. (i.e. you want to actually drop the values which are being rolled back, but don't exist in thestale
map, while not performing the second drop on items in thestale
map).There was a bit of musing on uses for
non-collection
evmaps and what they are useful for. I think the addition of rollback could make them more obviously useful.This is not really something I need, but seemed not impossible to implement