“Dwarf bench” is a collection of patterns that aim to capture the performance characteristics of analytical queries. The idea is to extend the taxonomy of computational patterns defined in the 2006 article “The Landscape of Parallel Computing Research” to data analytics in heterogeneous environments. Implementing basic structures and algorithms once for multiple devices is meant to balance performance and the use of device-specific capabilities against implementation effort. To express this language of patterns we chose platform-agnostic tools (OpenCL, SYCL).
#include <bench.hpp>
#include <iostream>
#include <vector>

int main() {
    // Devices to benchmark on.
    std::vector<DwarfBench::DeviceType> devices = {
        DwarfBench::DeviceType::CPU,
        DwarfBench::DeviceType::GPU,
    };
    // Patterns ("dwarfs") to measure.
    std::vector<DwarfBench::Dwarf> dwarfs = {
        DwarfBench::Dwarf::Join,
        DwarfBench::Dwarf::Sort,
        DwarfBench::Dwarf::Scan,
        DwarfBench::Dwarf::GroupBy
    };
    DwarfBench::DwarfBench db;
    // Run every dwarf on every device.
    for (DwarfBench::Dwarf dwarf : dwarfs) {
        for (DwarfBench::DeviceType device : devices) {
            // Designated initializers require C++20.
            DwarfBench::RunConfig rc = {
                .device = device,
                .inputSize = 1024,
                .iterations = 10,
                .dwarf = dwarf,
            };
            auto results = db.makeMeasurements(rc);
            // Printing relies on operator<< being available for Dwarf and DeviceType.
            for (const auto &result : results) {
                std::cout << dwarf << ' ' << device << " RESULT: "
                          << result.dataSize << ' ' << result.microseconds
                          << std::endl;
            }
        }
    }
}
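The example above prints one line per measurement. When several iterations are run, it is often more useful to reduce them to a few summary numbers. The following is a minimal sketch of that step, not part of the library: `Measurement` is a hypothetical stand-in that only assumes the two fields printed above (`dataSize`, `microseconds`), and `Summary` and `summarize` are illustrative names.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the library's result type. Only the two fields
// used in the example above (dataSize, microseconds) are assumed to exist.
struct Measurement {
    std::size_t dataSize;
    double microseconds;
};

// Summary statistics over the timings of repeated runs (illustrative type).
struct Summary {
    double minUs;
    double medianUs;
    double meanUs;
};

// Reduce the timings of repeated runs to min, median and mean.
// Returns all zeros when the input is empty.
Summary summarize(const std::vector<Measurement>& runs) {
    if (runs.empty()) {
        return {0.0, 0.0, 0.0};
    }
    std::vector<double> times;
    times.reserve(runs.size());
    double sum = 0.0;
    for (const Measurement& m : runs) {
        times.push_back(m.microseconds);
        sum += m.microseconds;
    }
    std::sort(times.begin(), times.end());
    const std::size_t n = times.size();
    // Median: middle element for odd n, mean of the two middle ones for even n.
    const double median = (n % 2 == 1)
        ? times[n / 2]
        : (times[n / 2 - 1] + times[n / 2]) / 2.0;
    return {times.front(), median, sum / static_cast<double>(n)};
}
```

With real measurements, the values would be copied from each result into a `Measurement` before calling `summarize`. The median is usually less sensitive than the mean to warm-up runs and other outliers.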
Check the list of available benchmarks using dwarf list
Launch one of the kernels, for example the Radix kernel:
./dwarf_bench Radix --device=cpu --input_size=25600 262144 524288 --report_path="report_radix_CPU.csv" --iterations=9
Switch to the GPU device using --device=gpu
Install the Intel oneAPI OpenCL runtime:
sudo apt install intel-oneapi-runtime-opencl
)nproc