akiradeveloper / dm-writeboost

Log-structured Caching for Linux
GNU General Public License v2.0

Question regarding writeboost and MySQL #146

Closed: joeey1984ad closed this issue 8 years ago

joeey1984ad commented 8 years ago

Hello, I was testing writeboost with Percona 5.5 and sysbench 0.4. The performance is great up until around 10GB of data, then the TPS quickly diminishes to a very low value. I repeated the same test with LVM-cache and bcache, but there the TPS drops steadily, not quickly. Is this due to the block size (and if so, can it be changed?) or is this normal behavior?

BTW, my buffer pool is quite large at 56GB. My cache SSD is 430GB.

I created 400GB of data with sysbench 0.4 like so:

sysbench --test=oltp --oltp-table-size=5990000000 --mysql-db=test1 --db-driver=mysql --mysql-user=root --mysql-password= prepare

Test is a complex R/W. Results:

Test1. All 400GB of data (5990000000 rows)

sysbench --test=oltp --db-driver=mysql --oltp-test-mode=complex --oltp-table-size=5990000000 --mysql-db=test1 --mysql-user=root --mysql-password= --max-time=60 --oltp-read-only=off --max-requests=0 --num-threads=56 run

sysbench 0.4.12: multi-threaded system evaluation benchmark

OLTP test statistics:
    queries performed:
        read:                            17080
        write:                           6100
        other:                           2440
        total:                           25620
    transactions:                        1220   (15.53 per sec.)
    deadlocks:                           0      (0.00 per sec.)
    read/write requests:                 23180  (808.07 per sec.)
    other operations:                    2440   (85.06 per sec.)


Test2. Around 80GB of data (350000000 rows) within the same 400GB table

sysbench --test=oltp --db-driver=mysql --oltp-test-mode=complex --oltp-table-size=350000000 --mysql-db=test1 --mysql-user=root --mysql-password= --max-time=20 --oltp-read-only=off --max-requests=0 --num-threads=56 run

sysbench 0.4.12: multi-threaded system evaluation benchmark

OLTP test statistics:
    queries performed:
        read:                            9198
        write:                           3285
        other:                           1314
        total:                           13797
    transactions:                        657    (30.93 per sec.)
    deadlocks:                           0      (0.00 per sec.)
    read/write requests:                 12483  (587.66 per sec.)
    other operations:                    1314   (61.86 per sec.)


Test3. Around 1GB of data (5000000 rows) within the same 400GB table

sysbench --test=oltp --db-driver=mysql --oltp-test-mode=complex --oltp-table-size=5000000 --mysql-db=test1 --mysql-user=root --mysql-password= --max-time=60 --oltp-read-only=off --max-requests=0 --num-threads=56 run

OLTP test statistics:
    queries performed:
        read:                            1128106
        write:                           402895
        other:                           161158
        total:                           1692159
    transactions:                        80579   (4027.58 per sec.)
    deadlocks:                           0       (0.00 per sec.)
    read/write requests:                 1531001 (76523.94 per sec.)
    other operations:                    161158  (8055.15 per sec.)

The results with LVM-cache and bcache, with the same hardware and MySQL configuration, are approximately:

Test1: 1700-2000 TPS
Test2: 2200-2800 TPS
Test3: 4400-4500 TPS

akiradeveloper commented 8 years ago

@joeey1984ad Give me the configuration of writeboost.

joeey1984ad commented 8 years ago

'dmsetup message mylv 2 writeback_threshold 70 nr_max_batched_writeback 32 read_cache_threshold 1'

BTW changing 'writeback_threshold 70' to 1 or 0 increases performance.
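(That is, one dmsetup message call per tunable; a sketch, using sector 0 as in the drop_caches command later in this thread:)

dmsetup message mylv 0 writeback_threshold 70
dmsetup message mylv 0 nr_max_batched_writeback 32
dmsetup message mylv 0 read_cache_threshold 1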

joeey1984ad commented 8 years ago

Also, I made a separate table with 80GB of data (around 350000000 rows) and the test was normal. It seems that when a table grows beyond 100GB or so, performance goes down.

akiradeveloper commented 8 years ago

@joeey1984ad Performance analysis is too difficult from a remote site. I can give you advice, but I can't see what's going on here.

akiradeveloper commented 8 years ago

I don't know about the workload (how it submits I/Os to the block layer).

BTW, read_cache_threshold isn't a boolean. Please read the doc.

joeey1984ad commented 8 years ago

I repeated the test several times just to make sure. Also, I have been using LVM-cache with MySQL for many years and have tested with 1GB of data and more; the performance degradation is gradual.

Do you think this has something to do with the log-structured nature of writeboost?

akiradeveloper commented 8 years ago

To analyze, you should determine which block request is slow in what situation.

joeey1984ad commented 8 years ago

how?

akiradeveloper commented 8 years ago

Do you think this has something to do with the log-structured nature of writeboost?

No. Rather, I think writeboost's log-structured nature gives the same acceleration regardless of the dataset.

akiradeveloper commented 8 years ago

You should check sysstat first, and see what is happening when the IOPS goes down.
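For example, something like this while the benchmark runs (a sketch; sdb and sdc stand in for your caching and backing devices):

# extended per-device statistics, 1-second intervals, sizes in MB
iostat -xm 1 sdb sdc

If the HDD shows near 100% util while the SSD sits mostly idle, the workload is bottlenecked on read misses or writeback.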

joeey1984ad commented 8 years ago

Anyway, I felt I had to post this issue since I did not notice this problem with the other block cache devices.

akiradeveloper commented 8 years ago

Maybe I can reproduce the problem in my environment, since the dataset is 400GB and there is free software that emulates OLTP workloads (such as dbench).

joeey1984ad commented 8 years ago

Sure .. that may help clarify the issue.

joeey1984ad commented 8 years ago

The speed is very good, even higher than LVM-cache, up to around an 80GB table size. But then the TPS dramatically slows down. I am sure something is causing that huge drop.

akiradeveloper commented 8 years ago

@joeey1984ad Can you try this command before you start the workload?

dmsetup suspend mylv
dmsetup resume mylv
dmsetup message mylv 0 drop_caches

I guess writeback is happening and that's the reason for the slowdown. I can't be sure because you didn't share everything you actually did.

Also, watch how the writeback is going with sysstat. How much is being written to the caching device?

akiradeveloper commented 8 years ago

It's a mantra to drop all dirty caches.

joeey1984ad commented 8 years ago

Ok, I will try:

dmsetup suspend mylv
dmsetup resume mylv
dmsetup message mylv 0 drop_caches

Just to be sure: no data was dropped in any of the block device caches I tried. All I did was 1. create the block device, 2. create the data with sysbench, 3. test.

What do you mean by writing to the caching device? Do you mean writing to the SSD alone?

Meanwhile, will you please try to replicate the test on your end?

akiradeveloper commented 8 years ago

@joeey1984ad

Did you drop the caches between steps 2 and 3?

akiradeveloper commented 8 years ago

Just to be sure: no data was dropped in any of the block device caches I tried.

This is strange because you set writeback_threshold to 70

joeey1984ad commented 8 years ago

No, I did not drop the cache in any of LVM, bcache, or writeboost. I will try your method and let you know.

akiradeveloper commented 8 years ago

What do you mean by writing to the caching device?

By looking at sysstat you can see how many writes per second are done to the caching device.

akiradeveloper commented 8 years ago

No, I did not drop the cache in any of LVM, bcache, or writeboost. I will try your method and let you know.

How confusing. Let's talk about writeboost alone.

joeey1984ad commented 8 years ago

@akiradeveloper I am just saying that the test I ran was the same across all the cache devices.

joeey1984ad commented 8 years ago

Checking sysstat might help, but that alone would not tell you why there is such a dramatic slowdown.

joeey1984ad commented 8 years ago

Ok I have done what was suggested.

1. create the block device
2. create the data with sysbench
3. drop the cache: dmsetup suspend mylv; dmsetup resume mylv; dmsetup message mylv 0 drop_caches
4. test

Same result.

akiradeveloper commented 8 years ago

How did the sysstat look when it slowed down?

akiradeveloper commented 8 years ago

You should find the clear boundary where the TPS drops drastically and compare the system's behavior on the two sides of that boundary.
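For example, a simple sampler (a sketch, assuming the device is named mylv as above):

# log a timestamp and the device status once per second during the run
while true; do date; dmsetup status mylv; sleep 1; done > wb-status.log

Then line the log up against the moment the TPS collapses.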

akiradeveloper commented 8 years ago

Checking sysstat might help

Not just checking. An experienced engineer can infer what is happening in the system from sysstat.

akiradeveloper commented 8 years ago

Give me the dmsetup status after the test
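That is (with the device name used above):

dmsetup status mylv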

joeey1984ad commented 8 years ago

Sorry, I am not that experienced in the sysstat area! Maybe we can test on your end?

akiradeveloper commented 8 years ago

@joeey1984ad So it's a good chance to analyze why writeboost slows down. BTW, give me the dmsetup status, which may tell me the reason.

joeey1984ad commented 8 years ago

Ok, I have to set up the whole thing again and write the 400GB of data back, so it may take a while. I will provide both 'dmsetup status' and sysstat output during the sysbench test. Also, I think it is best to do an independent test on your end.

joeey1984ad commented 8 years ago

I have a question regarding the 4K block size. Is there an option to raise the block size to a bigger number? In LVM-cache such an option exists; it is called chunk size. I know that if my chunk size is small in LVM, MySQL performance gets very bad. I was wondering whether the same thing is happening here.

akiradeveloper commented 8 years ago

Is there an option to raise the block size to a bigger number?

No, it's meaningless.

BTW, are you running this test on a VM?

akiradeveloper commented 8 years ago

I will explain why the cache block size is fixed at 4KB in writeboost and why it's meaningless to change the value:

  1. The actual write I/O to the caching device is 512KB.
  2. By Amdahl's law, the cost of accessing a block on a read hit, whether at 4KB or larger, is trivial, considering that CPU and memory are much faster than storage and the SSD is much faster than the HDD.

In log-structured caching, the term cache block means something different than in other, mediocre caching software that just reinvents set-associative caching.
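To put rough numbers on point 2 (illustrative latencies, not measurements): if an HDD random read costs ~10ms and an SSD 4KB read costs ~0.1ms, a read hit is already ~100x cheaper than a miss. Even if a larger cache block halved the SSD-side cost to ~0.05ms, each hit would improve by only ~0.05ms against the ~10ms saved by not going to the HDD, so the overall speedup barely changes.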

joeey1984ad commented 8 years ago

No, I have a dedicated server: 64GB RAM, E5-1630 v3.

akiradeveloper commented 8 years ago

56GB for the buffer pool out of 64GB total sounds like too much to me.

And I don't get why you are trying to optimize block I/Os while tuning the workload at the MySQL level.

Did you run the test without any block caching driver?

akiradeveloper commented 8 years ago

btw

Test1. All 400GB of data (5990000000 rows)

sysbench --test=oltp --db-driver=mysql --oltp-test-mode=complex --oltp-table-size=35000000 --mysql-db=test1 --mysql-user=root --mysql-password= --max-time=60 --oltp-read-only=off --max-requests=0 --num-threads=56 run

Why is the oltp-table-size so small?

joeey1984ad commented 8 years ago

56GB is for testing purposes and there is plenty of RAM available.

You mean running MySQL on a single HDD without a block cache?!

That would be crazy slow on a large database. I have tried it before.

400GB of data on a single HDD: performance is exactly like writeboost, around 15-30 TPS.

On LVM and bcache, TPS is around 1700-2000.

joeey1984ad commented 8 years ago

That was a typo. For all 400GB of data it should be as follows:

sysbench --test=oltp --db-driver=mysql --oltp-test-mode=complex --oltp-table-size=5990000000 --mysql-db=test1 --mysql-user=root --mysql-password= --max-time=60 --oltp-read-only=off --max-requests=0 --num-threads=56 run

akiradeveloper commented 8 years ago

Why do the results slow down as the dataset gets larger? Explain.

joeey1984ad commented 8 years ago

Because the amount of data that mysql needs is much faster than HDD IOPS. That is why the larger the database, the slower the TPS. For example, if I create a 5000000-row (1GB) table on a single HDD, performance is quite high, around 40000 TPS. The same test running over only 5000000 rows (1GB) of a 5990000000-row (400GB) table drops the TPS to 20000 for the HDD. If I run over all 5990000000 rows (400GB) of the 5990000000-row (400GB) table, the TPS drops to ONLY 30 for the HDD.

If I run over all 5990000000 rows (400GB) of the 5990000000-row (400GB) table, the TPS drops to 2000 for a block cache device. The whole point of using a block cache is to increase the I/O for a large database.
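(Rough arithmetic behind this, assuming roughly uniform access; sysbench's actual access distribution is skewed: a 56GB buffer pool over 400GB of data gives at best a ~14% buffer-pool hit ratio, so most reads fall through to the block layer, and a single HDD at ~100-200 random IOPS can then only sustain a few tens of TPS, which matches the 15-30 TPS above.)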

akiradeveloper commented 8 years ago

the amount of data that mysql needs is much faster than HDD IOPS

Sorry, type mismatch... Don't compare two values that have different types

joeey1984ad commented 8 years ago

@akiradeveloper I don't know what you mean!!!!

akiradeveloper commented 8 years ago

BTW changing 'writeback_threshold 70' to 1 or 0 increases performance.

How did it increase?

joeey1984ad commented 8 years ago

Oh yes, if I change it from 70 to 1, the performance gets much better.

On a 5000000-row table, your writeboost with 'writeback_threshold 1' does a very high 4200 TPS on only a 10MB buffer pool. The same test with 'writeback_threshold 70' gives 3000 TPS.

For comparison, the same test with LVM-cache is 3700 TPS.

But unfortunately writeboost fails on a larger database, 100GB or more; the performance becomes like an HDD.

akiradeveloper commented 8 years ago

Why do you set read_cache_threshold to 1?

joeey1984ad commented 8 years ago

Changing read_cache_threshold did not make any performance difference for me. I just set it like writeback_threshold.

akiradeveloper commented 8 years ago

Did you try read_cache_threshold 0?

akiradeveloper commented 8 years ago
read_cache_threshold (int) [Experimental]
  accepts: 0..127
  default: 0 (read caching disabled)
More than $read_cache_threshold * 4KB consecutive reads won't be staged.
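So, for example (a sketch following the doc above):

dmsetup message mylv 0 read_cache_threshold 32

would stage reads but skip any sequential run longer than 32 * 4KB = 128KB. With the value 1 used earlier in this thread, any consecutive read larger than 4KB is not staged, so effectively only isolated 4KB random reads get cached.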