mpicbg-scicomp / gearshifft

Benchmark Suite for Heterogeneous FFT Implementations
Apache License 2.0

Adding rocfft #126

Closed psteinb closed 5 years ago

psteinb commented 6 years ago

even though #124

tdd11235813 commented 6 years ago

OK, I just checked their README: rocFFT uses HIP under the hood, so your code might run on both platforms when compiled with either nvcc or hcc. The HIP calls should be error-handled as in CUDA. rocFFT itself has a hipFFT wrapper, which helps when you come from cuFFT code.

rocFFT specs (copied)

psteinb commented 5 years ago

56d402c builds a rocFFT runner based on ROCm 1.9.2, but it throws a runtime exception, which I still have to investigate.

psteinb commented 5 years ago

ready to rock and roll! I just produced this with rocfft in rocm 1.9.2 on my fiji nano. A bit of an xmas present to @tdd11235813 and gearshifft. ;)


```
$ cat result.csv
; "Fiji [Radeon R9 FURY / NANO Series]", "CC", 3.0, "PCI Bus ID", 3, "PCI Device ID", 0, "Multiprocessors", 64, "Memory [MiB]", 4096, "MemoryFree [MiB]", 3840, "HostMemory [MiB]", 64190, "MemClock [MHz]", 500, "GPUClock [MHz]", 1000, "rocfft", 806,"NumberWarmups",2,"NumberWarmRuns",10,"NumberTotalRuns",12,"ErrorBound",1e-05,"CurrentTime",1545302436,"CurrentTimeLocal","Thu Dec 20 11:40:36 2018","Hostname","islay.mpi-cbg.de","gearshifft","0.3.0"
; "Time_ContextCreate [ms]", 0.139483
; "Time_ContextDestroy [ms]", 0.084825
"library","inplace","complex","precision","dim","kind","nx","ny","nz","run","id","success","Time_Allocation [ms]","Time_PlanInitFwd [ms]","Time_PlanInitInv [ms]","Time_Upload [ms]","Time_FFT [ms]","Time_iFFT [ms]","Time_Download [ms]","Time_PlanDestroy [ms]","Time_Total [ms]","Size_DeviceBuffer [bytes]","Size_DevicePlan [bytes]","Size_DeviceTransfer [bytes]","Error_StandardDeviation","Error_Mismatches"
"RocFFT","Inplace","Real","float",1,"oddshape",265841,0,0,0,0,"Warmup",0.00462,1.514999,0.00344,0.32063698769,1.8060619831,1.8831809759,0.26831799746,0.03605,5.922963,1063368,0,1063364,2.1931190051e-07,0
"RocFFT","Inplace","Real","float",1,"oddshape",265841,0,0,1,0,"Warmup",0.004873,1.479619,0.005321,0.32303699851,1.8236620426,1.8718210459,0.30271700025,0.033771,5.991953,1063368,0,1063364,2.1931190051e-07,0
"RocFFT","Inplace","Real","float",1,"oddshape",265841,0,0,2,0,"Success",0.00452,1.536595,1.57312,0.28335699439,1.8033419847,1.8231819868,0.32591700554,0.034452,7.506252,1063368,0,1063364,2.1931190051e-07,0
"RocFFT","Inplace","Real","float",1,"oddshape",265841,0,0,3,0,"Success",0.003991,1.534916,1.323269,0.2974370122,1.8092620373,1.7956620455,0.32255700231,0.036562,7.204425,1063368,0,1063364,2.1931190051e-07,0
"RocFFT","Inplace","Real","float",1,"oddshape",265841,0,0,4,0,"Success",0.004337,1.541563,0.003636,0.3150370121,1.7907019854,1.8723009825,0.28927698731,0.032172,5.996714,1063368,0,1063364,2.1931190051e-07,0
"RocFFT","Inplace","Real","float",1,"oddshape",265841,0,0,5,0,"Success",0.0044,1.528481,0.005542,0.29535698891,1.8105419874,1.8059020042,0.32575699687,0.035631,5.896988,1063368,0,1063364,2.1931190051e-07,0
"RocFFT","Inplace","Real","float",1,"oddshape",265841,0,0,6,0,"Success",0.004275,1.542887,0.005118,0.30975699425,1.8286219835,1.8831809759,0.28495699167,0.031162,6.033764,1063368,0,1063364,2.1931190051e-07,0
"RocFFT","Inplace","Real","float",1,"oddshape",265841,0,0,7,0,"Success",0.004367,1.515764,0.003337,0.30735701323,1.826382041,1.866541028,0.34447699785,0.029631,6.047815,1063368,0,1063364,2.1931190051e-07,0
"RocFFT","Inplace","Real","float",1,"oddshape",265841,0,0,8,0,"Success",0.003804,1.529423,0.003619,0.27839699388,1.8143819571,1.8556619883,0.32111701369,0.035784,5.93105,1063368,0,1063364,2.1931190051e-07,0
"RocFFT","Inplace","Real","float",1,"oddshape",265841,0,0,9,0,"Success",0.004242,1.568875,0.003564,0.3084770143,1.8206219673,1.8628610373,0.34495699406,0.031359,6.089821,1063368,0,1063364,2.1931190051e-07,0
"RocFFT","Inplace","Real","float",1,"oddshape",265841,0,0,10,0,"Success",0.004232,1.529613,0.003135,0.27663698792,1.8127820492,1.8612610102,0.31919699907,0.035513,5.930612,1063368,0,1063364,2.1931190051e-07,0
"RocFFT","Inplace","Real","float",1,"oddshape",265841,0,0,11,0,"Success",0.004447,1.681499,0.003732,0.31951698661,1.8051019907,1.8540619612,0.31711700559,0.034131,6.11645,1063368,0,1063364,2.1931190051e-07,0
```

tdd11235813 commented 5 years ago

YES, awesome, thanks for the update :) I would start to integrate this in PR #124 after it is merged, as discussed offline, so just let me know if you want me to merge this. Thanks again and Happy Holidays! :)

psteinb commented 5 years ago

Happy New Year. Yes, please merge this PR and then we'll go for the super builds. I just tested with rocm 2.0 and it works flawlessly.

psteinb commented 5 years ago

thanks for the thorough review @tdd11235813, I hope I addressed all your concerns

tdd11235813 commented 5 years ago

awesome, thank you so much for your work!