tellproject / tellstore

An in-memory storage manager that can do versioning and fast scans
Apache License 2.0

guide to run TPCC benchmark? #1

Open guowentian opened 8 years ago

guowentian commented 8 years ago

Hi, I am interested in your project. I want to run the TPCC benchmark on Tell. It seems that I need to run tellstore together with commitmanager. Is there any guide for doing this? An example would be even better. Thanks in advance!

mpilman commented 8 years ago

Currently, the documentation is far from complete... We are improving the coordination and partitioning, so usage will change significantly in the near future.

To run tpcc, what you would need to do is the following:

Keep in mind that Tell currently needs an InfiniBand network in order to work. If you do not have access to an InfiniBand cluster, you could try to get it running with SoftRoCE (a minimal setup sketch follows below).

There is also a subproject helper_scripts under tellproject. It contains several Python scripts that start up a cluster and run a workload. However, these scripts are for internal use, which means you would have to adapt them - but most of the changes should be in ServerConfig.py.
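If you do want to experiment with SoftRoCE, a setup along these lines should expose an RDMA device over a regular Ethernet NIC (this is a sketch, not part of Tell itself; it assumes a reasonably recent kernel with the rxe driver, the iproute2 rdma tool, and libibverbs utilities, and the names rxe0/eth0 are placeholders):

```
# Load the software RoCE driver and attach an RDMA device to an Ethernet NIC.
# "rxe0" and "eth0" are placeholders - adapt them to your interfaces.
sudo modprobe rdma_rxe
sudo rdma link add rxe0 type rxe netdev eth0

# Check that the new device is visible to the verbs layer:
ibv_devices
```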

guowentian commented 8 years ago

Thanks for your instructions. Now I can set up TPCC and run it on my RDMA cluster. I find the helper_scripts very useful. I tried to run only Payment+NewOrder on 8 nodes, each node with 4 warehouses. The total throughput turns out to be not very good, and I think that is because I didn't set things up in the correct way. Do you have any suggestions on how to tune the performance? For example, the number of clients (currently I use 32 clients), logging (I saw you maintain log entries - do you write them to disk?), or some other factors? Thanks in advance.

mpilman commented 8 years ago

It really depends on what your cluster looks like. You should use 2-3 times as many tpcc_server instances as tellstore nodes. Then there are several things you can do/try:

Are you 100% sure that you compiled in Release mode? Everything will be quite slow if you run a Debug build (mostly because of logging). Furthermore, we usually used link-time optimization when we built binaries for benchmarks - this gives another ~20% performance boost. GCC seems to generate faster binaries than Clang, and I think Clang and the Intel compiler crash when you activate link-time optimization. To do so, call cmake with -DCMAKE_AR=/usr/bin/gcc-ar -DCMAKE_RANLIB=/usr/bin/gcc-ranlib -DCMAKE_CXX_FLAGS="-march=native -flto -fuse-linker-plugin" -DCMAKE_BUILD_TYPE=Release.
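Put together, a full Release build with those flags could look something like this (a minimal sketch; the out-of-source build directory and the paths to gcc-ar/gcc-ranlib are assumptions that may differ on your system):

```
mkdir build && cd build
cmake .. \
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_AR=/usr/bin/gcc-ar \
    -DCMAKE_RANLIB=/usr/bin/gcc-ranlib \
    -DCMAKE_CXX_FLAGS="-march=native -flto -fuse-linker-plugin"
make -j
```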

Do your machines have NUMA? In that case you should run one process per NUMA node (you can use numactl to pin them to a specific NUMA node). NUMA awareness is still something we should build, but currently this has low priority, as one process per NUMA node works quite well for us. Best practice is to have all TellStore processes on the NUMA node where your InfiniBand card is attached (probably node 0).
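As a rough sketch (the tellstored options are placeholders for however you currently launch the storage), pinning a storage process to NUMA node 0 would look like:

```
# Pin both the CPU and memory allocations of the storage process to NUMA node 0,
# which is usually where the InfiniBand HCA is attached.
numactl --cpunodebind=0 --membind=0 ./tellstored <your usual tellstored options>
```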

We usually ran only one client process in total (the client does not really do much; it mostly sends the queries to the tpcc_server instances). The tpcc_server instances run TellDB, which does all the transaction processing and index processing - this is why they are quite heavy-weight. If my memory serves me right, we used 20 clients per machine, but you might need to play around with these numbers.

Make sure you set the logging level to FATAL - this is especially important for tellstore.

Make sure that you populate enough warehouses. 50 should be enough for your cluster (currently Tell has quite a large memory overhead - this is because our GC only frees memory once per cycle, another thing we should fix - as you see, there is still quite a lot to do).

I hope these pointers help you out - otherwise I might need some more information.

@kbocksrocker Did I forget something? Do you have something to add?

kbocksrocker commented 8 years ago

I don't think I have anything to add to this. Can you provide us with your setup - i.e. how many instances you started and their configuration? Building in Release with Link-Time Optimization also helped quite a bit compared to a debug build.

guowentian commented 8 years ago

Thanks for your instant feedback! Just a few more questions I want to ask:

  1. I noticed the LogEntry in your source code. Does it mean that you do transaction logging during execution, i.e. writing the logs to disk for failure recovery? If so, is there any way to turn it off?
  2. I saw that there are several stores in tellstore, including rowstore, deltastore, and a log-structured one. Currently I run TPCC with rowstore. Is that right? Could you briefly describe the main differences between these stores? I could not find any descriptions of them.

As for my setup: our cluster has 8 nodes; the machines have no NUMA; I run 1 commitmanager, 8 tellstore instances, 8 tpcc_server instances, and 32 clients in total (4 clients for each server instance); I compile in Release mode. I also ran into the memory overhead problem, so I only use 32 warehouses overall, with 20G of memory for each node and 10 seconds per GC cycle.

As you suggested, I will try link-time optimization and more tpcc_server instances. Thanks for your replies :+1:

mpilman commented 8 years ago

You will definitely need more clients. The three stores are rowstore, columnmap, and logstructured.

For now, you could try logstructured. As TPCC does not need scans, it will be the fastest and uses less memory than the others. You can also try to increase the number of get/put threads to two.

TellDB writes log entries to TellStore. We do not provide an option to turn this off. You could comment out the code that does the logging, but in that case transactions would not be correctly implemented. Eventually the logs will be needed to roll back transactions if a TellDB node fails - this is not yet implemented, but we still write the logs to make sure that we measure correctly. So currently nothing would change if you commented this logging code out, but the numbers would be too high and therefore dishonest - it really depends on what you need/want. For TPCC it should not be too expensive anyway (one insert and one delete per transaction).

Try adding more clients; that should be the main bottleneck. You can also try to load fewer warehouses and use 4 storage nodes and 12 tpcc_servers. In that case the abort rate might increase, but we actually never saw a high abort rate.

kbocksrocker commented 8 years ago

If you are talking about the LogEntry class: that one does not belong to the "process logging" but to the logstructured-memory implementation that we have. It is one of our storage backends, where all information is written to an in-memory append-only structure (like a log). It is not related to the way we do transactional logging, which is part of the TellDB client library and not of the store. TellStore does not yet have support for error recovery and writes nothing at all to disk.

guowentian commented 8 years ago

@kbocksrocker You get my point. Yeah, I was just wondering whether that part does the transaction logging. Now I understand: Tell maintains transaction logs during execution, which are only used for rollback when transactions abort; the logs are not written to disk, because currently the fault-tolerance part is not fully developed. But with these log entries, I think it would not be difficult to further support a basic recovery process. @mpilman thanks for your explanation. I just want to know the best store for running an OLTP benchmark like TPCC. Maybe I will try both rowstore and logstructured.

mpilman commented 8 years ago

Don't use rowstore - it is the least robust of our implementations. Use columnmap and logstructured. I would settle for logstructured for now.

guowentian commented 8 years ago

Actually, I find rowstore is the most robust one... When running for too long, or when collocating storage nodes and processing nodes, the system crashes suddenly. I haven't explored the reason yet. I tried the logstructured store, but it just stalls when populating the database. I suppose it is because of the limited memory of our machines (32G); it easily approaches the memory limit.

mpilman commented 8 years ago

The fact that logstore is stalling indicates that you are not allocating enough memory for the hash table (logstore has a different hash table). Try to allocate at least 10% of the memory for the hash table (the parameter is in number of elements - but if you look at the ServerConfig.py file you will see that we use a different memory config for logstructured; try to take those numbers and scale them down). For example, with 20G per node that means reserving roughly 2G worth of hash-table entries. Logstructured should currently be the least memory-hungry storage.

guowentian commented 8 years ago

Hi, I tried to run with the logstructured store, but tpcc_server fails to launch because of "Error during socket operation [error = generic:111 Connection refused] (in handleSocketError at /user/wentian/programs/tell/crossbow/libs/infinio/include/crossbow/infinio/BatchingMessageSocket.hpp:407)".
Any suggestions? I think I have passed the correct IP address and cannot figure out why.

guowentian commented 8 years ago

The full error output message is as follows:

Starting client manager (in ClientManager at /user/wentian/programs/tell/tellstore/tellstore/ClientManager.hpp:305)
Connecting to CommitManager server 10.10.11.120:7242 (in connect at /user/wentian/programs/tell/commitmanager/client/ClientSocket.cpp:40)
Connecting to TellStore server 10.10.11.119:7241 on processor 0 (in connect at /user/wentian/programs/tell/tellstore/client/ClientSocket.cpp:245)
Connecting to CommitManager server 10.10.11.120:7242 (in connect at /user/wentian/programs/tell/commitmanager/client/ClientSocket.cpp:40)
Connecting to TellStore server 10.10.11.119:7241 on processor 1 (in connect at /user/wentian/programs/tell/tellstore/client/ClientSocket.cpp:245)
Error during socket operation [error = generic:111 Connection refused] (in handleSocketError at /user/wentian/programs/tell/crossbow/libs/infinio/include/crossbow/infinio/BatchingMessageSocket.hpp:407)
terminate called after throwing an instance of 'std::system_error'
  what(): Transport endpoint is not connected
Aborted (core dumped)

kbocksrocker commented 8 years ago

Sorry for the late response. Are you starting all the processes at once? The CommitManager and especially the TellStore server need a few seconds until they are ready, as they need to allocate a lot of memory. Can you start the client only after the tellstored process logs a "Storage ready" message? You might need to set the log level to INFO for tellstored to see that message.
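A staged startup could look roughly like the sketch below (the commitmanagerd binary name, the flags, and the log file paths are placeholders - adapt them to how you currently launch the processes):

```
# Start the commit manager and the storage first ("commitmanagerd" is a
# placeholder for however you launch the commit manager):
./commitmanagerd > commitmanager.log 2>&1 &
./tellstored > tellstore.log 2>&1 &

# Wait until tellstored has printed its "Storage ready" message
# (this requires the tellstored log level to be set to INFO):
until grep -q "Storage ready" tellstore.log; do sleep 1; done

# Only now start the tpcc_server instances and the clients.
./tpcc_server > tpcc_server.log 2>&1 &
```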