Closed HouzuoGuo closed 10 years ago
I made a quick documentation on it: https://drive.google.com/folderview?id=0B9CkwEErGkG1WVVmZkJHMXNackU&usp=sharing ("docs" folder)
As I said to you, I'm not an expert in database but I'm in touch with people who deal with evy bandwidth and concurrent access. It sounds to me like databases are facing the same kind of problems than big storage. ;-)
Tell me how it sounds to you. I hope it will help.
Regards, Alexandre Stein
Hello Alex.
Those materials are indeed very helpful, thank you very much!
I am also studying Mongo's queue implementation and see if it helps.
I wrote a key-value db based on B+Tree in Java, maybe it's helpful But don't know if you can read Chinese. I wrote the code comments and doc in Chinese. I'll take time to rewrite the doc and translate it into English.
project location: https://code.google.com/p/jue/source/browse#svn%2Ftrunk%2Fjue%2Fsrc%2Fcom%2Fgooglecode%2Fjue doc: https://drive.google.com/folderview?id=0B2aM_tq3elwXMjJlMjkyMzktNjFmYS00YjBjLWIyYzEtMzUxYWNlZTM2NjNl&usp=sharing
Thanks.
Hi Houzuo, I use three data sync strategies in my project now. It's like Redis.
Maybe you can use this as reference.
Sorry, clicked wrong button, made a mistake.
Thanks mate, what about:
I am not sure if the second point is useful or not, but the first point has been implemented: bfbc18d84b8e78e1b32bdf5bad5b3ba3e448d404
In the meanwhile, I have not forgotten IO queue! Combining with another known issue https://github.com/HouzuoGuo/tiedot/issues/39 I now have a new idea...
Our experience tells us that tiedot scales well enough from 1 to 4 threads, and it takes advantage of hyperthreading as well. However, going beyond 8 threads does not improve performance at all.
And we know that:
So how about:
That will give us several advantages:
But there is a disadvantage - the "tiedot new generation" will no longer be compatible with the current tiedot.
What do you think?
I think this is a VERY VERY good idea. :+1:
In my cases the retrocompatibility is not a problem. Offline procedure and other tricks are OK to me.
I don't have much time to play with Tiedot this time but I still can do some benchmarks for you.
Kind regards
Good idea! :+1: Collection operations are more like Map-Reduce, right? Maybe you can also write a tool to convert old tiedot file to new generation file.
OK, let's go ahead with that.
Good news - the new idea works pretty well! I setup proof of concept benchmarks in nextgen branch, and the scalability of simple document CRUD operations went beyond my expectation. I am still working on improving hash scan performance, UID operations have really poor performance at the moment.
Stay tuned.
Maybe use B+Tree instead of hash could help improve UID operations. This is an implementation of B+Tree for Go you could try: https://github.com/cznic/b
Will do...
Just an update on this issue:
By partitioning collection index/data and executing IO operations in parallel, tiedot still cannot achieve linear scalability (no speed up past 12 cores).
But don't worry - the "nextgen" branch brings a totally different approach (sorry - it happened again). It borrows the "single thread" idea from Redis, plus the coordination mechanism used by Cassandra, and tiedot runs each collection partition in an independent process, uses Unix domain socket as IPC mechanism.
Let's see...
I have experimented with different implementations of using message passing mechanism to implement collection partitions, however none of them brought any performance improvement at all. Perhaps I could use more HPC programming exercises in Go. For now, I do not think that IO queueing will help improving tiedot scalability. Let's leave this one for later, ok?
Investigate how introducing IO queues can benefit tiedot's operation.
The IO queues will operate on: