Level / classic-level

An abstract-level database backed by LevelDB.
MIT License

Compacting the database on close: advice or feature request #59

Closed: aaclayton closed this issue 1 year ago

aaclayton commented 1 year ago

Context

Our application opens and closes databases during its lifecycle. When we are done working with a certain database we would like to compact it so that it is closed in the most disk-space efficient format possible. It appears to me that the database is compacted when it is first opened, but not when it is closed.

Feature Request

Ideally, the db.close() method would support an option for doing this as part of the close workflow, something like:

await db.close({compact: true});

Other Solutions

We tried to perform compaction manually but encountered some unexpected outcomes and consequences. If the above feature request is not viable some advice would be helpful. My thought was to create a function that would compact the entire key range of the DB, as follows:

  /**
   * Compact the entire database.
   * See https://github.com/Level/classic-level#dbcompactrangestart-end-options-callback
   * @returns {Promise<void>}
   */
  async compactFull() {
    const i0 = this.keys({limit: 1, fillCache: false});
    const k0 = await i0.next();
    const i1 = this.keys({limit: 1, reverse: true, fillCache: false});
    const k1 = await i1.next();
    return this.compactRange(k0, k1, {keyEncoding: "utf8"});
  }

Unfortunately, calling this method is producing the opposite effect of what I had anticipated - disk utilization is nearly doubled. Here are the file sizes I measured:

// db.open()
1,088,289 bytes

// Modify some records
1,089,415 bytes

// Call compactFull()
2,245,000 bytes

// db.close()
2,245,000 bytes

// db.open()
1,156,988 bytes

// Modify some records
1,157,189 bytes

// Call db.compactFull()
2,244,613 bytes

// db.close()
2,244,613 bytes

// db.open()
1,088,522 bytes
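
For anyone reproducing measurements like these: the figures above are presumably the combined size of the files in the database directory. A minimal sketch of such a measurement using only Node.js built-ins follows. The helper name directorySize is hypothetical and not part of classic-level, and it counts only regular files at the top level of the directory.

```javascript
const fs = require('node:fs/promises');
const path = require('node:path');

/**
 * Sum the sizes of the regular files directly inside a directory.
 * Hypothetical helper for illustration; subdirectories are skipped.
 * @param {string} location  Path of the database directory
 * @returns {Promise<number>} Total size in bytes
 */
async function directorySize(location) {
  const entries = await fs.readdir(location, { withFileTypes: true });
  let total = 0;
  for (const entry of entries) {
    if (!entry.isFile()) continue; // skip subdirectories and other entries
    const { size } = await fs.stat(path.join(location, entry.name));
    total += size;
  }
  return total;
}
```

Calling `await directorySize(location)` after each step (open, write, compact, close) gives a sequence comparable to the one listed above.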

Is there a flaw in the way I am using the compactRange method? Is there a misunderstanding of what should happen when compacting a range? Is there some other way to solve our use case?

Thank you for guidance!

vweevers commented 1 year ago

> await db.close({compact: true})

That would require a new LevelDB feature, over at google/leveldb, so that feature request is out of scope here.

> Unfortunately, calling this method is producing the opposite effect of what I had anticipated

This may be because you have unclosed iterators: judging by your compactFull() code example, it.close() is not called before compactRange(). Each iterator holds a snapshot of the db, and this is known to affect compaction, though I don't know to what extent. I don't have an answer for why it would double the disk utilization.

You can also try ClassicLevel.repair() which implicitly does a compaction too.
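
A minimal sketch of that suggestion, assuming the database has to be closed before repair() can work on its directory. closeAndRepair is a hypothetical helper, not part of classic-level. The ClassicLevel class is passed in as an argument only to keep the sketch free of a hard dependency; in application code it would come from require('classic-level').

```javascript
/**
 * Close a database, then run a repair on its directory.
 * Hypothetical helper: `ClassicLevel` is expected to be the class exported
 * by classic-level, which offers a static repair(location) method.
 * @param {object} db            An open database instance
 * @param {string} location      Directory of that database
 * @param {object} ClassicLevel  The ClassicLevel class
 * @returns {Promise<void>}
 */
async function closeAndRepair(db, location, ClassicLevel) {
  // Assumption: repair operates on the files on disk, so release the handle first.
  await db.close();
  await ClassicLevel.repair(location);
}
```

Whether repair() reduces disk usage as much as an explicit compaction is not established in this thread, so it is worth measuring before relying on it.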

aaclayton commented 1 year ago

Hi @vweevers, thanks for the response and advice. I'll explore this and follow up with my findings!

aaclayton commented 1 year ago

Hi @vweevers, a follow-up: you were correct that this was due to unclosed iterators. Thanks for noticing that problem in my method. The following is now working well in my use case:

  /**
   * Compact the entire database.
   * See https://github.com/Level/classic-level#dbcompactrangestart-end-options-callback
   * @returns {Promise<void>}
   */
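  async compactFull() {
    // Read the first key, then close the iterator so it releases its snapshot.
    const i0 = this.keys({limit: 1, fillCache: false});
    const k0 = await i0.next();
    await i0.close();
    // Read the last key the same way, iterating in reverse.
    const i1 = this.keys({limit: 1, reverse: true, fillCache: false});
    const k1 = await i1.next();
    await i1.close();
    // Compact everything between the first and last key.
    return this.compactRange(k0, k1, {keyEncoding: "utf8"});
  }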
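
A slightly more defensive variant of the same idea, as a sketch rather than a definitive implementation: it closes each iterator in a finally block so that a failed next() cannot leave a snapshot open, and it skips compaction when the database has no keys. The standalone functions boundaryKey and compactFull are hypothetical names; like the method above, the sketch assumes keys are utf8 strings.

```javascript
/**
 * Read the first (or, with reverse: true, the last) key of a database.
 * The iterator is always closed, even if next() throws.
 * @param {object} db  A database exposing keys()
 * @param {boolean} [reverse=false]
 * @returns {Promise<string|undefined>} undefined when the database is empty
 */
async function boundaryKey(db, reverse = false) {
  const it = db.keys({ limit: 1, reverse, fillCache: false });
  try {
    return await it.next();
  } finally {
    await it.close();
  }
}

/**
 * Compact the full key range of a database.
 * @param {object} db  A database exposing keys() and compactRange()
 * @returns {Promise<boolean>} false if the database was empty, true otherwise
 */
async function compactFull(db) {
  const first = await boundaryKey(db);
  if (first === undefined) return false; // nothing to compact
  const last = await boundaryKey(db, true);
  // Both iterators are closed by this point.
  await db.compactRange(first, last, { keyEncoding: 'utf8' });
  return true;
}
```

To approximate the requested close({compact: true}) behavior, an application could call `await compactFull(db)` followed by `await db.close()`.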

Thanks for your guidance. I may open an upstream issue with google/leveldb, although I wouldn't expect it to get any traction as I understand that project is stable and locked for new features.