cyb70289 closed this issue 3 years ago.
@cyb70289 Sorry for the late reply. Can you tell me a bit more about the dataset you used, e.g. how many files you had? I am guessing you used the 128 MB file I sent you.
Yes, I'm using the 128MB file. I generated 40 files under cephfs.
FYI, I built binaries from your Arrow PR, not from this repo. Build type is Release.
@cyb70289 These are some results on my setup with 4 OSDs (bare-metal, 16 logical cores, 64 GB DRAM each), 3 MONs (bare-metal, 16 logical cores, 64 GB DRAM each), and a single client. The network is 10Gb/s and every OSD is on an NVMe drive. As for the Ceph configuration, my cephfs_data pool has 128 PGs with a replication factor of 3 and PG autoscaling turned off.
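For reference, a pool with that layout could be created roughly as follows. This is only a sketch based on the description above (pool name and PG count taken from it), not necessarily the exact commands used on this cluster:

    ceph osd pool create cephfs_data 128 128              # 128 placement groups
    ceph osd pool set cephfs_data size 3                   # replication factor 3
    ceph osd pool set cephfs_data pg_autoscale_mode off    # disable PG autoscaling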
rados-parquet
root@node0:/users/noobjc# python3 bench.py rpq 1 /mnt/cephfs/dataset 16 result_rpq.json
10.858497619628906
11.692570924758911
11.904498815536499
9.314198017120361
8.072686433792114
5.368509531021118
4.973011493682861
4.362470388412476
parquet
root@node0:/users/noobjc# python3 bench.py pq 1 /mnt/cephfs/dataset 16 result_pq.json
9.049651622772217
11.662115573883057
11.579400539398193
10.566514492034912
10.72178316116333
9.71735692024231
9.325489521026611
9.544288635253906
Can you please check the number of PGs present in your Ceph cluster? The number of PGs should follow the formula

    Total PGs = (OSDs * 100) / pool size

or else performance is severely hampered (especially if there are fewer PGs than the formula suggests).
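For example, assuming the 4-OSD setup above with a pool size (replica count) of 3, the formula gives (4 * 100) / 3 ≈ 133, which is conventionally rounded to the nearest power of two, i.e. 128 PGs.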
FYI, I built binaries from your Arrow PR, not from this repo. Build type is Release.
That should be fine; no major changes have been introduced. But for sanity, you can still try an Arrow build in Release mode from arrow-master.
PGs look good on my cluster (3 OSDs, 3 replicas, 128 PGs, autoscaling off). I will try 4 OSDs to match your configuration.
A quick question: should osd op threads = 16 in /etc/ceph/ceph.conf match your CPU cores?
A quick question: should osd op threads = 16 in /etc/ceph/ceph.conf match your CPU cores?
Yes, that's right.
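For completeness, the relevant stanza in /etc/ceph/ceph.conf would look roughly like this; the section placement is an assumption, and the value 16 simply mirrors the 16 logical cores on the OSD nodes described above:

    [osd]
    osd op threads = 16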
It looks like it's related to network bandwidth. After limiting the virtual network to a lower bandwidth, I see better results from rpq than from pq for selections <= 25%.
@cyb70289 That sounds great, thanks a lot for sharing! Out of interest, to what bandwidth did you limit the network?
@cyb70289 That sounds great, thanks a lot for sharing! Out of interest, to what bandwidth did you limit the network?
I just tried to simulate a 10Gb network: I limited the client NIC to 10Gb/s and each of the 4 OSD NICs to 2.5Gb/s.
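As a rough sketch of how such a cap can be applied on a VM NIC with a token-bucket filter via tc; the interface name eth0 and the burst/latency values are assumptions, not necessarily what was actually used here:

    # limit the interface to roughly 2.5 Gb/s
    tc qdisc add dev eth0 root tbf rate 2500mbit burst 1mbit latency 50ms
    # remove the limit again
    tc qdisc del dev eth0 root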
Hi @JayjeetAtGithub,
I'm evaluating rados-parquet performance and tested it on 4 virtual machines. This is not an ideal setup for benchmarking, but I did find some issues I cannot explain and would like to hear your comments.
The host server has 64 Xeon(R) Gold 5218 cores and 128 GB of RAM. I launched 4 VMs with 8 cores and 16 GB of RAM each: one VM as the client and the other 3 VMs as Ceph OSD+Monitor nodes.
Running bench.py [1], I see that rados-parquet performance is much worse than parquet for all selection ratios. But I didn't see any resource bottleneck on the test VMs or the host: CPU, memory, disk, and network usage are all low. It's a bit strange why rados-parquet is running slow in this test setup. What possible tuning strategies can I try?
Test logs: rados-parquet, parquet
[1] https://github.com/JayjeetAtGithub/skyhook-nsdi/blob/master/bench.py
Yibo