Open hadoopch opened 2 years ago
Here is some additional information:
OS: CentOS Linux release 7.9.2009
Local HDD: SEAGATE ST2000NX0273 and SEAGATE ST1800MM0159
fio version: fio-3.7
iperf version: iperf 3.1.7
I used the standard thresholds for the test.
Hi @hadoopch,
I have run these tests successfully on workers with 7.2k rpm HDDs and on masters with 10k and 15k rpm HDDs. You need to tune the disks. There is an example of how to do this in the HPE Reference Architecture for Cloudera Data Platform, and further information in Performance Tuning on Linux — Disk I/O.
Kind regards
Hi all,
In my opinion, some of the tests make no sense, especially with the default value of osqd (= iodepth).
Analyzing disk latency with a high iodepth makes no sense at all.
A high iodepth is used in the async test and at the same time latency is checked against a threshold for these async tests.
If you push the device to its maximum IOPS and bandwidth, the measured latency includes the time requests spend waiting in the queue, so it no longer reflects the latency of the disk itself.
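To put rough numbers on this, here is a minimal sketch based on Little's Law (mean latency = outstanding I/Os / throughput). The figure of 200 IOPS is an assumed value for a single HDD under random I/O, not a measurement from this thread, and the sketch assumes the device is already saturated so that IOPS stays constant as the queue depth grows.

```python
def expected_latency_ms(iodepth: int, iops: float) -> float:
    """Mean completion latency in ms implied by Little's Law (L = N / X)."""
    if iops <= 0:
        raise ValueError("iops must be positive")
    return iodepth * 1000.0 / iops


# Assumed: a saturated HDD that sustains about 200 random IOPS.
ASSUMED_IOPS = 200.0

for depth in (1, 8, 32):
    latency = expected_latency_ms(depth, ASSUMED_IOPS)
    print(f"iodepth={depth:>2}  mean latency ~ {latency:.0f} ms")
```

With the same disk, the reported mean latency grows linearly with the queue depth, so a fixed latency threshold checked at a high iodepth mostly measures queueing rather than the device.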
You can tune the fio iodepth via the osqd parameter in the test config.
See also, e.g.
Hi, have you tried setting nr_requests and fifo_batch to [12-16] on the workers and adjusting osqd?
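For reference, both tunables are exposed per block device in sysfs. The following is only a sketch: the device name `sdb` and the value 16 are placeholders, `fifo_batch` is only present when a deadline-type I/O scheduler is active for the device, and writing the values requires root. By default the script only prints what it would do.

```python
from pathlib import Path


def queue_settings(device: str, nr_requests: int, fifo_batch: int) -> dict:
    """Map sysfs paths to the values that should be written to them."""
    queue = Path("/sys/block") / device / "queue"
    return {
        queue / "nr_requests": str(nr_requests),
        # Only exists with the deadline / mq-deadline scheduler.
        queue / "iosched" / "fifo_batch": str(fifo_batch),
    }


def apply_settings(settings: dict, dry_run: bool = True) -> None:
    """Print the changes, or write them to sysfs when dry_run is False."""
    for path, value in settings.items():
        if dry_run:
            print(f"echo {value} > {path}")
        else:
            path.write_text(value)  # needs root privileges


# "sdb" and 16 are placeholder values for illustration.
apply_settings(queue_settings("sdb", nr_requests=16, fifo_batch=16))
```

Settings written this way do not survive a reboot, so they would have to be reapplied (for example via a udev rule or a tuned profile) before each test run.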
KR
Hi Gerado,
The storage device acceptance tests were designed specifically for all non-local storage solutions. I ran my tests on Huawei Cloud and Open Telekom Cloud using virtio block devices.
Have you ever tried to run the tests on nodes with non-local storage, e.g. VMware?
To me, the test is questionable:
osqd is not mentioned at all in the manual. Furthermore, in my opinion, it makes no sense to measure disk latency with the same parameters that are used to measure IOPS and bandwidth.
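As an illustration of what I mean by separate parameters, here is a sketch that builds two fio command lines: one with iodepth=1 for latency and one with a deeper queue for IOPS and bandwidth. This is not how the acceptance test suite invokes fio. The device path `/dev/sdX`, the depth of 32, the block size and the runtime are placeholder assumptions. The jobs use randread together with `--readonly`, so they do not write to the device.

```python
def fio_command(name: str, filename: str, iodepth: int,
                rw: str = "randread", bs: str = "4k",
                runtime_s: int = 60) -> list:
    """Build an fio argument list; nothing is executed here."""
    return [
        "fio",
        f"--name={name}",
        f"--filename={filename}",
        "--ioengine=libaio",
        "--direct=1",          # bypass the page cache
        f"--rw={rw}",
        f"--bs={bs}",
        f"--iodepth={iodepth}",
        f"--runtime={runtime_s}",
        "--time_based",
        "--readonly",          # refuse to run write workloads
        "--output-format=json",
    ]


# Placeholder device path; replace it before running anything.
latency_job = fio_command("latency", "/dev/sdX", iodepth=1)
throughput_job = fio_command("throughput", "/dev/sdX", iodepth=32)

print(" ".join(latency_job))
print(" ".join(throughput_job))
```

Latency thresholds would then be checked against the first job only, and the IOPS and bandwidth thresholds against the second.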
Uli
When you run CDP Private Cloud on non-local storage, Cloudera recommends running its storage acceptance test:
Cloudera Enterprise Storage Device Acceptance Criteria Guide
That document links to this GitHub page.
What I have done:
I installed Cloudera CDP Private Cloud on our cloud platform (OpenStack) and ran various tests (microbenchmark tests) with local and non-local storage. So far I have run the tests only for the worker nodes.
Even with local storage, some of the tests failed. Finally, I ran the tests on bare-metal servers with local disks. Here, too, some of the tests failed.
1) Bare metal and cloud with local disks
Seqread and seqwrite succeeded. Randrw failed.
2) VM with non-local disks
All latency tests failed.
Because tests also failed on bare metal with local disks, I have some doubts about the acceptance tests. Has anyone ever run these tests successfully?