Open tonyzhenyuxu opened 7 years ago
FPGAs typically work somewhat differently from discrete GPUs, in that programming the device takes a very long time, on the order of hours.
For a discrete GPU, such as AMD, or NVIDIA, the workflow for OpenCL looks something like:
For a discrete GPU, steps 1 to 3 take ~seconds. Step 4 takes ~seconds, or less. Step 5 takes as long as it takes: minutes, hours, days, or weeks, depending on what you're doing or training.
For an FPGA, step 4 takes significantly longer: hours instead of seconds. So the workflow would be quite different. The OpenCL compilation essentially has to happen offline, rather than at runtime.
It's probably not a massively blocking change, but it would need some rethinking of how the program runs. For example, EasyCL currently assumes that the OpenCL will be compiled at runtime. You'd need to partition EasyCL into two parts:
(But note that I have zero experience with FPGAs, so I don't really know. You should check for yourself how compilation on an FPGA works.)
You are absolutely right. I have to compile the kernel code (OpenCL) offline, which usually takes 10 to 15 hours. I have been reading the source code and comparing it with the Altera OpenCL examples. As you said, I don't think a massive code change is needed, but I do need to partition the code.
Thanks.
Cool :-)
I am digging further into the source code and trying to build AlexNet, but I notice there is no stride (or stride = 1) in your implementation. Is that true? For example, the first AlexNet layer has filter size = 11 x 11 and 96 feature maps (I guess here we call it numFilters), with stride = 4, so the output is 55 x 55 x 96. I don't know how to do that in DeepCL.
Thanks.
Can you help me understand the following code, in the deriveOthers function in LayerDimensions.cpp?
The following line puzzles me: this->outputSize = padZeros ? (filterSize % 2 == 0 ? inputSize / (skip + 1) + 1 : inputSize / (skip + 1)) : (inputSize - filterSize) / (skip + 1) + 1;
I am wondering if the padZeros branch, (filterSize % 2 == 0 ? inputSize / (skip + 1) + 1 : inputSize / (skip + 1)), could instead be: (filterSize % 2 == 0 ? (inputSize - filterSize) / (skip + 1) + 1 : (inputSize - filterSize) / (skip + 1))
It seems to me that skip + 1 = stride. Is that right? There is not much explanation of the dimensions in the code. Would you mind sparing a few minutes on this?
Appreciated.
-T
Yes, skip + 1 is stride. 'Skip' is from a relatively old paper. 'Stride' is the common notation nowadays.
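To make the dimension arithmetic concrete, here is a minimal C++ sketch that mirrors the expression quoted from deriveOthers, with stride = skip + 1. The function name convOutputSize is hypothetical and is not part of DeepCL. The input size of 227 in the usage note below is also an assumption, since the thread only gives the 11 x 11 filter, the stride of 4, and the 55 x 55 output.

```cpp
#include <cassert>

// Hypothetical helper, not DeepCL's API: it restates the expression quoted
// from LayerDimensions.cpp so the cases can be checked one at a time.
// Integer division truncates, exactly as in the quoted line.
int convOutputSize(int inputSize, int filterSize, int skip, bool padZeros) {
    assert(skip >= 0 && filterSize > 0 && inputSize >= filterSize);
    const int stride = skip + 1;  // 'skip' is the older name for stride - 1
    if (padZeros) {
        // Zero padding: an even filter size yields one extra output position.
        return (filterSize % 2 == 0) ? inputSize / stride + 1
                                     : inputSize / stride;
    }
    // No padding: the filter has to fit entirely inside the input.
    return (inputSize - filterSize) / stride + 1;
}
```

Under these assumptions, convOutputSize(227, 11, 3, false) evaluates to (227 - 11) / 4 + 1 = 55, which matches the 55 x 55 output mentioned for the first AlexNet layer, so a stride of 4 corresponds to skip = 3.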
Hi, I also want to improve this code to support FPGAs.
I use a Terasic C5P, whose chip is an Altera Cyclone V.
I can run some simple demos, such as hello world and vector_add, but when I call clinfo, it reports:
root@up2:~# clinfo
I: ICD loader reports no usable platforms
Is there some hope? Thanks very much
FPGAs need to be compiled offline, with the kernels burned onto the FPGA. This can take several hours. Once they are burned in, you can run them.
You would need to modify your code to support two stages like this, and EasyCL too.
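As a rough illustration of the runtime half of such a two-stage split, the C++ sketch below only reads a kernel binary that was produced offline. The name loadKernelBinary is hypothetical. In standard OpenCL, bytes like these would then be handed to clCreateProgramWithBinary rather than compiling source at runtime, but how EasyCL would wrap that call is an assumption and is not covered in this thread.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: stage 2 (runtime) loads what stage 1 (the offline,
// hours-long compile) wrote to disk. The file format is whatever the
// vendor's offline compiler emitted; this code does not interpret it.
std::vector<unsigned char> loadKernelBinary(const std::string& path) {
    assert(!path.empty());
    std::ifstream file(path, std::ios::binary);
    if (!file) {
        throw std::runtime_error("cannot open kernel binary: " + path);
    }
    // Read the whole file as raw bytes.
    return std::vector<unsigned char>(std::istreambuf_iterator<char>(file),
                                      std::istreambuf_iterator<char>());
}
```

Keeping the load step separate from any compile step means the program never needs the compiler at runtime, which is the point of the partition described above.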
@lihao2333
Oh. Re-reading this now that I have access to a web browser, rather than just replying to an email: OK, right, you would need to find an OpenCL driver, and ICD registration, for your FPGA. You could, for example, ask your FPGA vendor's customer support, or search their forums.
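One way to check ICD registration is to look at the registration files themselves. On a typical Linux install the ICD loader scans a vendors directory, conventionally /etc/OpenCL/vendors, for .icd files that each name a vendor's OpenCL library. The C++17 sketch below lists them. The names IcdEntry and listIcdEntries are hypothetical, and the directory is passed in as a parameter because its location on any particular board image is an assumption.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

struct IcdEntry {
    std::string icdFile;  // path of the .icd registration file
    std::string library;  // first line of that file: the library it names
};

// Hypothetical helper: returns one entry per .icd file found in vendorDir.
// An empty result suggests that no vendor driver is registered there.
std::vector<IcdEntry> listIcdEntries(const std::string& vendorDir) {
    assert(!vendorDir.empty());
    std::vector<IcdEntry> entries;
    std::error_code ec;
    if (!fs::is_directory(vendorDir, ec)) {
        return entries;  // directory missing or unreadable
    }
    for (const auto& entry : fs::directory_iterator(vendorDir, ec)) {
        if (entry.path().extension() != ".icd") continue;
        std::ifstream file(entry.path());
        std::string library;
        std::getline(file, library);  // stays empty if the file is empty
        entries.push_back({entry.path().string(), library});
    }
    return entries;
}
```

If listIcdEntries("/etc/OpenCL/vendors") comes back empty, that would be consistent with clinfo finding no usable platforms, and the vendor's driver package is the place to look for the missing registration.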
Would you please give me your comments on supporting OpenCL running on an FPGA device, such as the Altera Arria 10, instead of a GPU? Thanks.