Closed: LongTengDao closed this issue 3 years ago
Could you please elaborate? I don't understand what `db.batchFirst()` should do or how it'd even know about the second db.
I'm certain though that 2PC should be implemented in userland, as it requires (at minimum) a transport, protocol & coordinator, which is far outside the scope of a single-process level db.
@vweevers
Ok. I mean: when I want to implement writes to multiple dbs as one whole transaction, I first have to additionally record all the data I will pass to the leveldb API, and only then actually call the leveldb API, so that if the power goes off during this period, I can finish the rest from that record on reboot.
But this doubles the write cost. Since leveldb itself already implements the log-dump ① mechanism within a single database, such a heavy cost could easily be avoided by a slight change to leveldb.
① dump: I try to use the leveldb term here; I mean moving data from the log to an SST.
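The double cost described above could be sketched like this (an illustrative toy under my own assumptions: `redoLog`, `writeRedoLog`, and the Map-backed dbs are hypothetical stand-ins, not leveldb APIs):

```javascript
// Sketch of the userland workaround: before touching either db, write
// every pending operation to our own redo log, then apply it; on reboot
// we could replay whatever the log still holds. Every byte is written
// twice — once to the redo log, once to the db's own log.
const redoLog = [];            // stands in for a durable userland log file

function writeRedoLog(ops) {   // step 1: record everything (the extra cost)
  redoLog.push(...ops);
}

function applyToDb(db, ops) {  // step 2: the real leveldb-style batch
  for (const { key, value } of ops) db.set(key, value);
}

const dbA = new Map(), dbB = new Map();
const opsA = [{ key: 'a', value: 1 }];
const opsB = [{ key: 'b', value: 2 }];

writeRedoLog([...opsA, ...opsB]);
applyToDb(dbA, opsA);
applyToDb(dbB, opsB);
redoLog.length = 0;            // both dbs applied: clear the redo log

console.log(dbA.get('a'), dbB.get('b')); // 1 2
```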
What `db.batchFirst()` does is very much like `db.batch()`; the difference is that it only logs, and never dumps unless `db.second()` is called.
If the program dies between the two actions, then on the next `level('/db')` after reboot, the user can specify `level('/db', option)` to decide: give up the last log, or make it dump-able.
This allows users to implement multi-database transactions without additionally recording all of the batch data: when every `db.batchFirst()` call returns successfully, the userland code simply records `true`, so that if the power is lost at that moment, the next time we open the db we pass the option `second` and all of the log becomes dump-able; if this record does not exist, the userland code knows that it failed halfway, so it passes the option `giveup` when opening the db, and all of the log data from the last `db.batchFirst()` disappears.
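If I understand the proposal, the recovery flow could be sketched like this (a toy in-memory model under my own assumptions: `batchFirst`, `second`, and the `'second'`/`'giveup'` open options are the proposed names, not a real levelup API, and `reopen` only simulates what reopening after a crash would decide):

```javascript
// Toy model of the proposed two-phase behaviour: batchFirst() appends
// to the log in an un-dump-able state; after a crash, reopening with
// 'second' makes those entries dump-able, 'giveup' discards them.
class ToyDb {
  constructor() {
    this.log = [];     // dump-able (committed) entries
    this.pending = []; // un-dump-able entries from batchFirst
  }
  batch(ops) { this.log.push(...ops); }          // normal write path
  batchFirst(ops) { this.pending.push(...ops); } // phase one
  second() { this.log.push(...this.pending); this.pending = []; }
}

// Simulates reopening the db after a reboot with one of the options.
function reopen(db, option) {
  if (option === 'second') db.second();
  if (option === 'giveup') db.pending = [];
  return db;
}

// Phase one on both dbs, then the single userland record.
const dbA = new ToyDb(), dbB = new ToyDb();
dbA.batchFirst([{ type: 'put', key: 'a', value: 1 }]);
dbB.batchFirst([{ type: 'put', key: 'b', value: 2 }]);
const committed = true; // the only thing userland has to persist

// After the (simulated) crash, decide based on the recorded flag.
reopen(dbA, committed ? 'second' : 'giveup');
reopen(dbB, committed ? 'second' : 'giveup');
console.log(dbA.log.length, dbB.log.length); // 1 1
```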
I hope I explained it clearly... Thanks for continuing to communicate!
Can you define "dump"? If you mean moving a log to an SST (`.ldb` file), this is done when the log reaches a certain size (or upon startup when recovering from a crash), rather than after every batch AFAIK. Which is to say, it can contain multiple batches.
@vweevers
Ok. I refined my wording above; it was not rigorous. Yes, "dump" is what you mean, and leveldb can keep the log-dump mechanism you describe. The operation expected of `db.second()` is not really an internal dump, but making the un-dump-able log entries added by `db.batchFirst()` become dump-able (equal to log entries added by `db.batch()`).
In other words, in the traditional mode, things work like this:
When `db.batch()` is called, leveldb performs a log operation. Once that succeeds, the db is effectively modified. As for the internal dump, it can remain completely opaque and has nothing to do with the user.
But in the new mode:
When `db.batchFirst()` is called, leveldb performs a log operation, but this data is in an un-dump-able state; unless `db.second()` or `level('/db', 'second')` is called, this data will never dump (and it disappears on `level('/db', 'giveup')`).
> moving a log to an SST (`.ldb` file) [..] is done when the log reaches a certain size (or upon startup when recovering from a crash)
To clarify, what I meant to say is that it's outside of our control. This mechanism is implemented in LevelDB.
Oh! Sorry! I thought this repo was the leveldb implementation repo...
With that clarified, it seems there's no further action item here, so I'm closing this.
Let me use JavaScript to describe it.
Currently:
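The snippet that followed here did not survive; a plausible minimal version of the "currently" case, where one `db.batch()` call is already atomic within a single database (`db` is an in-memory stand-in for a levelup instance, whose real `batch()` is async):

```javascript
// A single atomic batch on one db: no extra bookkeeping is needed,
// because leveldb's own log makes the whole batch succeed or fail
// together. The Map-backed `db` below only mimics the call shape.
const store = new Map();
const db = {
  batch(ops) {
    for (const op of ops) {
      if (op.type === 'put') store.set(op.key, op.value);
      else if (op.type === 'del') store.delete(op.key);
    }
  },
};

db.batch([
  { type: 'put', key: 'a', value: '1' },
  { type: 'put', key: 'b', value: '2' },
]);
console.log(store.size); // 2
```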
But when I want to do a transaction with another leveldb (or with another write such as `fs.writeFile`), I must record (userland log) the batch myself:

Aim (the API could be as below, or something like that):
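A sketch of the aimed-for shape (`batchFirst`/`second` are the hypothetical names from this issue, not a real levelup API; the dbs are toy in-memory stand-ins and the real methods would be async):

```javascript
// The desired write path: phase one logs on both dbs without making
// the entries dump-able, then one tiny userland record, then phase two.
function toyDb() {
  return {
    log: [],      // dump-able entries
    pending: [],  // un-dump-able entries written by batchFirst
    batchFirst(ops) { this.pending.push(...ops); },                   // phase one
    second() { this.log.push(...this.pending); this.pending = []; },  // phase two
  };
}

let flag = false; // stands in for the single durable userland marker

const dbA = toyDb(), dbB = toyDb();
dbA.batchFirst([{ type: 'put', key: 'x', value: 1 }]);
dbB.batchFirst([{ type: 'put', key: 'y', value: 2 }]);
flag = true;      // all phase-one writes succeeded; record only this
dbA.second();
dbB.second();
console.log(dbA.log.length + dbB.log.length); // 2
```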