Investigate IO queuing and its benefits

HouzuoGuo commented 10 years ago

Investigate how introducing IO queues can benefit tiedot's operation.

The IO queues will operate on:

document CRUD
hash bucket CRU

alexandrestein commented 10 years ago

I made a quick documentation on it: https://drive.google.com/folderview?id=0B9CkwEErGkG1WVVmZkJHMXNackU&usp=sharing ("docs" folder)

As I said to you, I'm not an expert in database but I'm in touch with people who deal with evy bandwidth and concurrent access. It sounds to me like databases are facing the same kind of problems than big storage. ;-)

Tell me how it sounds to you. I hope it will help.

Regards, Alexandre Stein

HouzuoGuo commented 10 years ago

Hello Alex.

Those materials are indeed very helpful, thank you very much!

I am also studying Mongo's queue implementation and see if it helps.

NoahShen commented 10 years ago

I wrote a key-value db based on B+Tree in Java, maybe it's helpful But don't know if you can read Chinese. I wrote the code comments and doc in Chinese. I'll take time to rewrite the doc and translate it into English.

project location: https://code.google.com/p/jue/source/browse#svn%2Ftrunk%2Fjue%2Fsrc%2Fcom%2Fgooglecode%2Fjue doc: https://drive.google.com/folderview?id=0B2aM_tq3elwXMjJlMjkyMzktNjFmYS00YjBjLWIyYzEtMzUxYWNlZTM2NjNl&usp=sharing

Thanks.

NoahShen commented 10 years ago

Hi Houzuo, I use three data sync strategies in my project now. It's like Redis.

Always sync, sync every time when writing new data. Very slow but very safe.
Buffer sync, never sync unless buffer is full or the file is closed. Very fast, less safe.
Sync every sec, only sync data when latest sync time is more than 1 sec from now.

Maybe you can use this as reference.

NoahShen commented 10 years ago

Sorry, clicked wrong button, made a mistake.

HouzuoGuo commented 10 years ago

Thanks mate, what about:

Having an HTTP API to flush all buffers
Use a command line parameter to configure how often buffers are flushed automatically, by default it is 1 minute.

HouzuoGuo commented 10 years ago

I am not sure if the second point is useful or not, but the first point has been implemented: bfbc18d84b8e78e1b32bdf5bad5b3ba3e448d404

In the meanwhile, I have not forgotten IO queue! Combining with another known issue https://github.com/HouzuoGuo/tiedot/issues/39 I now have a new idea...

Our experience tells us that tiedot scales well enough from 1 to 4 threads, and it takes advantage of hyperthreading as well. However, going beyond 8 threads does not improve performance at all.

And we know that:

goroutines are extremely lightweight, it does not hurt performance by spawning 20k of them
go channels scale almost linearly as number of hardware threads goes up

So how about:

Split each collection into 16MB chunks
Each chunk maintains its own index structure and data file
Each chunk has its own IO queue (a channel) and a single thread working on CRUD operations
All collection operations (including queries) are sent to a central IO queue (a channel), then dispatched to chunk(s) to work on

That will give us several advantages:

Even better data safety - every chunk can function independently - in case of a broken index structure from one chunk, others can still function at their maximum availability.
Better scalability - independent chunks do not interfere with one another's locks

But there is a disadvantage - the "tiedot new generation" will no longer be compatible with the current tiedot.

What do you think?

alexandrestein commented 10 years ago

I think this is a VERY VERY good idea. :+1:

In my cases the retrocompatibility is not a problem. Offline procedure and other tricks are OK to me.

I don't have much time to play with Tiedot this time but I still can do some benchmarks for you.

Kind regards

NoahShen commented 10 years ago

Good idea! :+1: Collection operations are more like Map-Reduce, right? Maybe you can also write a tool to convert old tiedot file to new generation file.

HouzuoGuo commented 10 years ago

OK, let's go ahead with that.

HouzuoGuo commented 10 years ago

Good news - the new idea works pretty well! I setup proof of concept benchmarks in nextgen branch, and the scalability of simple document CRUD operations went beyond my expectation. I am still working on improving hash scan performance, UID operations have really poor performance at the moment.

Stay tuned.

alexandrestein commented 10 years ago

Maybe use B+Tree instead of hash could help improve UID operations. This is an implementation of B+Tree for Go you could try: https://github.com/cznic/b

HouzuoGuo commented 10 years ago

Will do...

HouzuoGuo commented 10 years ago

Just an update on this issue:

By partitioning collection index/data and executing IO operations in parallel, tiedot still cannot achieve linear scalability (no speed up past 12 cores).

But don't worry - the "nextgen" branch brings a totally different approach (sorry - it happened again). It borrows the "single thread" idea from Redis, plus the coordination mechanism used by Cassandra, and tiedot runs each collection partition in an independent process, uses Unix domain socket as IPC mechanism.

Let's see...

HouzuoGuo commented 10 years ago

I have experimented with different implementations of using message passing mechanism to implement collection partitions, however none of them brought any performance improvement at all. Perhaps I could use more HPC programming exercises in Go. For now, I do not think that IO queueing will help improving tiedot scalability. Let's leave this one for later, ok?

HouzuoGuo / tiedot

Investigate IO queuing and its benefits #23