Open scarrazza opened 5 years ago
Not yet, I have to finalize the vegas opencl loop in C++. I am working on it, so probably in 1-2h the code will be ready for compilation.
The problem with unification is that I am sure the FPGA kernel will look quite different from the GPU/CPU one.
We should make it look similar enough though. I haven't looked at OpenCL for Xilinx, but if it is very different then what is the point of having OpenCL support...
It will look similar, but to get fast results we have to add tons of #pragmas and specific attributes for barrier and pipeline operations (things that we can surely include in a single cl file with #ifdef FPGA...).
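Something like this, just as a rough sketch of the single-file idea. The kernel name, its body and the `PIPELINE_LOOP` macro are made up for illustration, and `xcl_pipeline_loop` is used only as an example of a Xilinx-specific loop attribute. The `active_lines` helper is not a real preprocessor: it only mimics non-nested `#ifdef`/`#else`/`#endif` to show which lines each target would see.

```python
# Sketch only: one kernel source for every target, FPGA extras behind #ifdef FPGA.
# The kernel, its body and the PIPELINE_LOOP macro are hypothetical.
KERNEL_SRC = """\
#ifdef FPGA
#define PIPELINE_LOOP __attribute__((xcl_pipeline_loop))
#else
#define PIPELINE_LOOP
#endif

__kernel void accumulate(__global const float *w, __global float *out, const int n) {
    float acc = 0.0f;
    PIPELINE_LOOP
    for (int i = 0; i < n; i++) {
        acc += w[i];
    }
    out[0] = acc;
}
"""

# Preprocessor symbols defined for each target (assumed target names).
TARGET_DEFINES = {"cpu": [], "gpu": [], "fpga": ["FPGA"]}


def build_options(target):
    """Return the '-D NAME' build options string for a target."""
    if target not in TARGET_DEFINES:
        raise ValueError(f"unknown target: {target}")
    return " ".join(f"-D {name}" for name in TARGET_DEFINES[target])


def active_lines(source, defines):
    """Mimic non-nested #ifdef/#else/#endif to show what a target compiles."""
    keep, out = True, []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("#ifdef"):
            keep = stripped.split()[1] in defines
        elif stripped.startswith("#else"):
            keep = not keep
        elif stripped.startswith("#endif"):
            keep = True
        elif keep:
            out.append(line)
    return out


if __name__ == "__main__":
    for target in TARGET_DEFINES:
        print(target, repr(build_options(target)))
        print("\n".join(active_lines(KERNEL_SRC, set(TARGET_DEFINES[target]))))
```

The CPU and GPU builds get an empty `PIPELINE_LOOP`, so the kernel body stays identical everywhere and only the FPGA build picks up the extra attribute.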
Software emulation seems to work. I am now testing hardware emulation and real hardware; I will push the bitstreams as soon as the compilation finishes.
OK, all three modes are working. The current kernel on real FPGA hardware takes almost the same time as a 36-thread CPU run on dom. Compilation for hardware takes 2h.
Here is the profiling for sw_emulation: profiling_cpu.pdf
Next step: play with the kernel and the special opencl instructions described in
We have to be careful with the binaries. Having one set for testing is OK, but I think it is better to share them in some folder on dom, because the repository is already over 60 MB...
Yeah, we can drop them from here.
Asking for 100 events makes the hw_emu usable. Here is the report: profiling_hw_emu.pdf
There is a nice table which helps in checking if we are doing things properly (and we are not).
I think the OpenCL kernel can be greatly improved in a way that is much lighter on memory (and also less useful for HEP, but we don't really care at this point :P). I'll continue playing with the Python/C version until I am happy with it.
Is this able to run on the FPGA already?
I might try tomorrow to unify the Python and C versions so that they use the same OpenCL kernels; that way FPGA, C and Python are always synchronized.
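One possible way to keep them in sync, as a sketch rather than a plan: every host reads the same `.cl` file and reports a hash of what it loaded, so it is easy to check that two runs used the same kernel. The `load_kernel` helper and the `vegas.cl` file name are hypothetical.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def load_kernel(path):
    """Read the shared .cl file and return (source, sha256 hex digest)."""
    source = Path(path).read_text(encoding="utf-8")
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
    return source, digest


if __name__ == "__main__":
    # Stand-in for the shared kernel file; the real one would live in the repo.
    with tempfile.TemporaryDirectory() as tmp:
        kernel_path = os.path.join(tmp, "vegas.cl")
        Path(kernel_path).write_text("__kernel void k() {}\n", encoding="utf-8")
        source, digest = load_kernel(kernel_path)
        print(f"loaded {len(source)} characters, sha256={digest[:12]}")
```

The C host would read the same file before creating its program, so the kernel text lives in exactly one place.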