Closed joeey1984ad closed 8 years ago
@joeey1984ad Give me the configuration of writeboost.
'dmsetup message mylv 2 writeback_threshold 70 nr_max_batched_writeback 32 read_cache_threshold 1'
BTW changing 'writeback_threshold 70' to 1 or 0 increases performance.
Also I made a seperate table with 80GB Data ( around 350000000 rows) and the test was normal. It seems when a table grows higher than 100GB or so performance goes down.
@joeey1984ad performance analysis is too difficult from remote site. I can give you advices but I can't see what's going on here.
I don't know about the workload (how it submits ios to the block layer)
btw, read_cache_threshold isn't a boolean. please read the doc.
I repeated the test several times just to make sure. Also I have been using Lvm-cache with mysql for many years and have been testing with 1GB DATA and more, the performance degradation is gradual.
Do you think, this has something to do with log structure nature writeboost ?
To analyze, you should determine which block request is slow in what situation.
how?
Do you think, this has something to do with log structure nature writeboost ?
No. Rather, I think writeboost's log-structured nature can give same acceleration to the different datasets.
You should see the sysstat first. And what is happening when the iops is going down
Anyway, I felt like I had to post this issue since I did not notice this problem with other cache blocks devices.
Maybe, I can reproduce the same problem in my environment because the dataset is 400GB and we have free softwares that emulate OLTP workloads (such as dbench)
Sure .. that may help clarify the issue.
The speed is very good even higher than LVM cache up to around 80GB table size. But the TPS dramatically slows down. I am sure something is causing that huge drop.
@joeey1984ad Can you try this command before you start the workload?
dmsetup suspend mylv
dmsetup resume mylv
dmsetup message mylv 0 drop_caches
I guess writeback is happening and that's the reason of slowing down. I can't be sure because you didn't share all you actually did.
Also, see how writeback is going by the sysstat. how about writing to the caching device?
it's a mantra to drop all dirty caches
Ok, I will try dmsetup suspend mylv dmsetup resume mylv dmsetup message mylv 0 drop_caches
Just to be sure no Data was dropped in neither of the block device caches I tried. All I did was to 1.create the block device 2.Create DATA with sysbench 3.test
What do you mean writing to the caching device? you mean writing to SSD alone?
Meanwhile will you please try to replicate the test on your workload?
@joeey1984ad
did you run dropping caches between 2 and 3?
Just to be sure no Data was dropped in neither of the block device caches I tried.
This is strange because you set writeback_threshold to 70
no, I did not drop cache in neither of LVM, BCACHE nor Writeboost. I will try your method and let you know.
What do you mean writing to the caching device?
By seeing the sysstat you can see how much write per sec is done to the caching device
no, in neither of LVM, BCACHE nor Writeboost. I will try your method and let you know.
How confusing. Let's talk about writeboost alone
@akiradeveloper I am just saying that the test I have done was the same throughout all cache devices.
Checking the sysstat might help but that alone would not tell you why such a dramatic slow down.
Ok I have done what was suggested.
1.create the block device 2.Create DATA with sysbench 3.drop cache dmsetup suspend mylv, dmsetup resume mylv, dmsetup message mylv 0 drop_caches 4.test
Same result.
How was the sysstat when it slowed down?
You should find the clear boundary of drastically dropping the TPS and compare the two on the other sides of the boundary.
Checking the sysstat might help
Not just checking. Experienced engineer can infer what is happening in the system from the sysstat.
Give me the dmsetup status after the test
sorry i am not that experienced in sysstat area! maybe if we can test on your end?
@joeey1984ad so it's a good chance to analyze why writeboost slows down. btw, give me the dmsetup status
which may tells me the reason.
Ok, I have to setup the whole thing again and write the 400 GB DATA back. It may take a while. I will provide both 'dmsetup status' and 'sysstat' during sysbench test. Also I think it is best to do an independent test on your end.
I have a question regarding block size 4k? Is there an option to raise the block size to a biger number. In LVM cache such an option exist. It is called chunk size. I know that if my chuck size is small in LVM the performance for mysql gets very bad. I was wondering if may be the same thing is happening here.
Is there an option to raise the block size to a biger number.
no. it's meaningless.
btw, are you running this test on VM?
I will explain why cache block size is 4KB fixed in writeboost and why it's meaningless to change the value
In log-structured caching, the term cache block is different from other mediocre caching software that just reinvents set associative caching.
No i have a dedicated Server 64GB Ram E5-1630 v3
56GB to the buffer pool out of total 64GB sounds too much for me.
and I don't get why you trying to optimize block ios while you are trying to optimize the workload at the MySQL level.
Did you run the test without any block caching driver?
btw
Test1. All 400GB DATA (5990000000 Rows)
sysbench --test=oltp --db-driver=mysql --oltp-test-mode=complex --oltp-table-size=350000 00 --mysql-db=test1 --mysql-user=root --mysql-password= --max-time=60 --oltp-read-only=off --max-requests=0 --num-threa ds=56 run
why is the oltp-table-size so small?
56GB is for testing purposes and there is plenty of ram available.
You mean running mysql on a single HDD without block cache?!
That will be crazy slow on a large database. I have tried it before.
400 DATA on a single HDD. Performance is exactly like writeboost - around 15-30 TPS
On LVM and Bcache, TPS is around 1700-2000
That was a typo. For the All 400GB DATA should be as follow
sysbench --test=oltp --db-driver=mysql --oltp-test-mode=complex --oltp-table-size=5990000000 --mysql-db=test1 --mysql-user=root --mysql-password= --max-time=60 --oltp-read-only=off --max-requests=0 --num-threa ds=56 run
Why the results slow down as the dataset gets larger? explain
Because the amount of data that mysql needs is much faster than HDD IOPS. That is why the larger the database the slower the TPS. For example if I create 5000000 rows ( 1GB) table on a single HDD. The performance is quite high. Around 40000 TPS. Same test running only 5000000 rows (1GB) on a 5990000000 Row (400GB DATA) table, the TPS drops to 20000 for HDD. If I run 5990000000 rows (400GB ) on 5990000000 Row (400GB DATA) table, the TPS drops to ONLY 30 for HDD.
If I run 5990000000 rows (400GB) on 5990000000 Row (400GB DATA) table, the TPS drops to 2000 for BLOCK CACHE DEVICE. The whole point of using block cache is to increase the IO for large database.
the amount of data that mysql needs is much faster than HDD IOPS
Sorry, type mismatch... Don't compare two values that have different types
@akiradeveloper I don't know what you mean!!!!
BTW changing 'writeback_threshold 70' to 1 or 0 increases performance.
How it increased?
oh yes, if i change from 70 to 1. the performance gets much better.
on a 5000000 table. your writeboost with 'writeback_threshold 1' does a very high 4200 TPSon only 10MB buffer pool. Same test with 'writeback_threshold 70' TPS is 3000.
For comparison reasons same test with LVM is 3700 TPS.
But unfortunately witeboost fails on a larger database 100GB or more. the performance becomes like a HDD.
Why do you set read_cache_threshold 1?
changing read_cache_threshold did not do any performance issues for me. I just kept like writeback_threshold.
did you try read_cache_threshold 0?
read_cache_threshold (int) [Experimental]
accepts: 0..127
default: 0 (read caching disabled)
More than $read_cache_threshold * 4KB consecutive reads won't be staged.
Hello, I was testing writeboost with Percona 5.5 and sysbenck 0.4. The performance is great up until around 10GB DATA. The performance quicly diminishes to a very low TPS. I repeated same test with LVM-Cache and Bcache but TPS drops down 'steadily' Not 'quicly'. Is this due the block size ( and if so can it be changed? ) or is this a normal behavior?
BTW. My Buffer Pool is quite high 56GB. My Cache SSD is 430GB.
I created a 400GB DATA with sysbench 0.4 like so,
sysbench --test=oltp --oltp-table-size=5990000000 --mysql-db=test1 --db-driver=mysql --mysql-user=root --mysql-password= prepare
Test is a complex R/W. Results:
Test1. All 400GB DATA (5990000000 Rows)
sysbench --test=oltp --db-driver=mysql --oltp-test-mode=complex --oltp-table-size=5990000000 --mysql-db=test1 --mysql-user=root --mysql-password= --max-time=60 --oltp-read-only=off --max-requests=0 --num-threa ds=56 run sysbench 0.4.12: multi-threaded system evaluation benchmark
OLTP test statistics: queries performed: read: 17080 write: 6100 other: 2440 total: 25620 transactions: 1220 (15.53 per sec.) deadlocks: 0 (0.00 per sec.) read/write requests: 23180 (808.07 per sec.) other operations: 2440 (85.06 per sec.)
Test2. around 80Gb DATA (350000000 Rows) with the same 400GB table
sysbench --test=oltp --db-driver=mysql --oltp-test-mode=complex --oltp-table-size=350000000 --mysql-db=test1 --mysql-user=root --mysql-password= --max-time=20 --oltp-read-only=off --max-requests=0 --num-threads=56 run sysbench 0.4.12: multi-threaded system evaluation benchmark
OLTP test statistics: queries performed: read: 9198 write: 3285 other: 1314 total: 13797 transactions: 657 (30.93 per sec.) deadlocks: 0 (0.00 per sec.) read/write requests: 12483 (587.66 per sec.) other operations: 1314 (61.86 per sec.)
Teest3. Around 1GB DATA (5000000 Rows) with the same 400GB table
sysbench --test=oltp --db-driver=mysql --oltp-test-mode=complex --oltp-table-size=5000000 --mysql-db=test1 --mysql-user=root --mysql-password= --max-time=60 --oltp-read-only=off --max-requests=0 --num-threads=56 run
OLTP test statistics: queries performed: read: 1128106 write: 402895 other: 161158 total: 1692159 transactions: 80579 (4027.58 per sec.) deadlocks: 0 (0.00 per sec.) read/write requests: 1531001 (76523.94 per sec.) other operations: 161158 (8055.15 per sec.)
The results with LVM-Cache and Bcache With same hardware and Mysql configuration is pporximately Test1, 1700-2000 TPS Test2, 2200-2800 TPS Test3, 4400-4500 TPS