sbtourist / Journal.IO

Journal.IO is a zero-dependency, fast and easy-to-use journal storage implementation.
Apache License 2.0

Improve delete performance #30

Closed: robshep closed this issue 11 years ago

robshep commented 11 years ago

As per discussion here:

https://groups.google.com/forum/?fromgroups=#!topic/journalio/f7Touwno_PE

There are a few options available for adding asynchronicity to the journal delete operations.

the delete method does a sync every time to ensure you find all the latest writes: if you make it async, you may not find them and get an exception. In your case, by the way, it just works because you have all inserts synced too.

There are several options to improve it (a rough sketch follows):

1. Make an async delete version which enforces/checks that there are no in-flight writes.
2. Make an async delete version which does the actual deletes in batches.
3. Mark the locations as deleted in memory while they wait to be batched.
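
For illustration only, here is a minimal, generic sketch of the batching idea behind options 2 and 3. It is not Journal.IO code: the names (BatchedDeletes, applyBatch, flushIntervalMillis) are made up, and the journal-specific work is hidden behind a callback so nothing is assumed about the library's internals. Callers only mark locations as deleted in memory, and a background task periodically applies the accumulated deletes in one pass.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ConcurrentLinkedQueue;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;
    import java.util.function.Consumer;

    // Hypothetical sketch, not Journal.IO code.
    public class BatchedDeletes<L> {

        private final ConcurrentLinkedQueue<L> pending = new ConcurrentLinkedQueue<L>();
        private final Consumer<List<L>> applyBatch; // e.g. sync once, then mark every location deleted
        private final ScheduledExecutorService flusher = Executors.newSingleThreadScheduledExecutor();

        public BatchedDeletes(Consumer<List<L>> applyBatch, long flushIntervalMillis) {
            this.applyBatch = applyBatch;
            flusher.scheduleWithFixedDelay(this::flush, flushIntervalMillis, flushIntervalMillis, TimeUnit.MILLISECONDS);
        }

        // Option 3: just record the delete in memory and return immediately.
        public void delete(L location) {
            pending.add(location);
        }

        // Option 2: physically apply all accumulated deletes in one batch.
        public synchronized void flush() {
            List<L> batch = new ArrayList<L>();
            for (L next; (next = pending.poll()) != null; ) {
                batch.add(next);
            }
            if (!batch.isEmpty()) {
                applyBatch.accept(batch);
            }
        }

        public void close() throws InterruptedException {
            flusher.shutdown();
            flusher.awaitTermination(10, TimeUnit.SECONDS);
            flush(); // apply anything still pending
        }
    }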

sbtourist commented 11 years ago

Thanks Rob, I'll have a think and get back, feel free to report any feedback btw :)

robshep commented 11 years ago

I will fork and hack this evening on the various options.

I will also put my example producer/consumer code (with a negligible workload) in a Gist so we can empirically evaluate the various implementations.

Rob

sbtourist commented 11 years ago

That's great, Rob.

I'll be on vacation most of the week, but I'll answer emails and issues.

Happy hacking :)

robshep commented 11 years ago

I've been inspecting the code in an attempt to find the relevant segments where any of the three proposed options could introduce their respective functionality.

I'm not sure I understand the workings entirely - and have become somewhat confused about various aspects of the current implementation.

I believe (whilst being fully prepared to have my beliefs destroyed) that the code already performs a flush of the "Async-Write-Queue" before marking the existing locations as deleted. (See comment (1) in code below)

I think the sync boolean parameter of DataFileAccessor.updateLocation(Location location, byte type, boolean sync) only enables the physical sync. (See comment (2) below)

    void updateLocation(Location location, byte type, boolean sync) throws CompactedDataFileException, IOException {
        Lock threadLock = getOrCreateLock(Thread.currentThread(), location.getDataFileId());
        accessorLock.lock();
        threadLock.lock();
        try {
            journal.sync();  // <-- 1. This waits for the async queue to empty?
            //
            RandomAccessFile raf = getOrCreateRaf(Thread.currentThread(), location.getDataFileId());
            if (seekToLocation(raf, location, false)) { // position the file at the record header for this location
                int pointer = raf.readInt();
                int size = raf.readInt();
                raf.write(type); // overwrite the record's type byte (e.g. mark it deleted)
                IOHelper.skipBytes(raf, size - Journal.HEADER_SIZE);
                location.setType(type);
                if (sync) {
                    IOHelper.sync(raf.getFD()); // <-- 2. This is a physical sync, determined by boolean sync
                }
            } else {
                throw new IOException("Cannot find location: " + location);
            }
        } finally {
            threadLock.unlock();
            accessorLock.unlock();
        }
    }

Now, in order to test some theories, I took a look at where the physical sync (IOHelper.sync()) is used and it appears that it is not enabled by default.

So it seems that delete() is not "slower" than write() - as shown by my first encounters - just safer.
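
For context, here is roughly how that toggle is driven in the cases below. This is only a setup sketch: the directory path is made up, and the setDirectory/open/close calls are assumed from the Journal.IO README; setPhysicalSync is the flag exercised in cases A to C.

    import java.io.File;

    import journal.io.api.Journal;

    public class JournalSetup {
        public static void main(String[] args) throws Exception {
            File dir = new File("/tmp/journal-test"); // made-up location
            dir.mkdirs();

            Journal journal = new Journal();
            journal.setDirectory(dir);
            journal.setPhysicalSync(false); // the default: no physical sync on ordinary writes
            journal.open();
            try {
                // ... write / read / delete records here ...
            } finally {
                journal.close();
            }
        }
    }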

Here are some numbers for the delete lag that I _first_ saw.

( Harness code is at: https://gist.github.com/4669110 )

Harness:

  1. Spawn threads to enqueue N records to a journal.
  2. After writing, each one hands the location to a consumer, which is queued on a thread pool to be executed.
  3. When the consumer executes, it reads and then deletes the record.

I measured the lag between completing the N rounds of steps 1 and 2 and subsequently finishing all N enqueued rounds of step 3 (a rough sketch of the harness structure follows).
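
The real harness is in the gist above; the following is just a rough, self-contained sketch of its shape. The journal write and read-then-delete operations are passed in as callbacks, so no particular Journal.IO method signatures are assumed, and the names (LagHarness, writeRecord, readThenDelete) are made up for illustration.

    import java.util.concurrent.CountDownLatch;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.function.Consumer;
    import java.util.function.Supplier;

    // Hypothetical sketch of the harness shape, not the gist code.
    public class LagHarness<L> {

        public void run(int records, Supplier<L> writeRecord, Consumer<L> readThenDelete)
                throws InterruptedException {
            ExecutorService consumers = Executors.newSingleThreadExecutor(); // readers: 1
            CountDownLatch consumed = new CountDownLatch(records);

            long start = System.currentTimeMillis();
            for (int i = 0; i < records; i++) {           // writers: 1 (the calling thread)
                final L location = writeRecord.get();     // steps 1 & 2: write a record, get its location
                consumers.execute(() -> {                 // step 2: queue a consumer task for that location
                    readThenDelete.accept(location);      // step 3: read then delete the record
                    consumed.countDown();
                });
            }
            long produced = System.currentTimeMillis();

            consumed.await();                             // wait for all queued reads/deletes to finish
            long done = System.currentTimeMillis();
            consumers.shutdown();

            System.out.printf("production: %ds, consumption: %ds, lag: %ds%n",
                    (produced - start) / 1000, (done - start) / 1000, (done - produced) / 1000);
        }
    }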

A) journal.setPhysicalSync(false), the default (deletes are still phys-sync'ed though)

N records: 1000, writers: 1, readers: 1, record size: 54 bytes

Time to complete production: 7s, time to complete consumption: 52s, consumption/production lag: 45s

then

B) journal.setPhysicalSync(true)

N records: 1000, writers: 1, readers: 1, record size: 54 bytes

Time to complete production: 52s, time to complete consumption: 69s, consumption/production lag: 16s

Furthermore (for raw webscale :) speed) I patched DataFileAccessor.java:70 as follows:

-   if (sync) {
-       IOHelper.sync(raf.getFD());
-   }
+   if (journal.isPhysicalSync()) {
+       IOHelper.sync(raf.getFD());
+   }

C) journal.setPhysicalSync(false), with the above patch applied: no physical sync waiting at all

N records: 1000, writers: 1, readers: 1, record size: 54 bytes

Time to complete production: 0s, time to complete consumption: 0s, consumption/production lag: 0s

That circumstance reminds me of this... http://www.xtranormal.com/watch/6995033

Please let me know what your thoughts are on what I have managed to understand.

sbtourist commented 11 years ago

Yes, you are correct on both points above, and yes, the Journal.IO codebase is not an easy one to grok :)

By the way, it is not a matter of safety: it is a matter of correctness, because unsynced (I mean Journal.sync) locations will not be found and hence not deleted. If you see things in this light, my three suggestions probably make more sense :)

robshep commented 11 years ago

Apologies Sergio, I think I am missing some level of understanding.

I had gathered that the journal.sync() statement did flush any inflight pending writes before doing any deletions.

Option 1 is already implemented isn't it?

Sorry for any crossed wires.

Thanks though.

Rob

sbtourist commented 11 years ago

On Sat, Feb 2, 2013 at 6:28 PM, robshep notifications@github.com wrote:

> Apologies Sergio, I think I am missing some level of understanding.

No worries, happy to answer :)

> I had gathered that the journal.sync() statement did flush any inflight pending writes before doing any deletions.

And that's correct... :)

> Option 1 is already implemented isn't it?

Nope, it isn't yet, but that's the one I'm going to implement soon.

Stay tuned ;)

sbtourist commented 11 years ago

Guys, I've made some delete performance improvements without impacting the APIs, basically starting from option 1 above. Here's the relevant commit: https://github.com/sbtourist/Journal.IO/commit/ead66d0c9cbc3060f3f320af1846bcf63501ec61

Here are some performance numbers:

1. Deletes per second with no journal physical sync enabled (the default): 20k
2. Deletes per second with journal physical sync enabled: 5k

Feedback wanted :)

robshep commented 11 years ago

Sorry for the late response. I was busy on other projects. Thank you for this enhancement.

Running this on my small evaluation harness provides the values:

before

rev: b73ccf8, N records: 50000, writers: 1, readers: 1, record size: 104 bytes, physical sync: true

Time to complete production: 3076s, time to complete consumption: 3652s, consumption/production lag: 576s

after

rev: ead66d0, N records: 50000, writers: 1, readers: 1, record size: 104 bytes, physical sync: true

Time to complete production: 2970s, time to complete consumption: 2970s, consumption/production lag: 0s

So the main benefit for my example here, which mirrors my requirements, is that this commit has decreased the delete overhead so much that there is now no lag.

Many thanks for this effort Sergio. We are very grateful.

sbtourist commented 11 years ago

Hi Rob, great to hear! Could you write about that on the mailing list too?
