Feature/xflow webcl integration

With this pull request, the WebCL API is integrated into the Xflow architecture. Please note that there might be some deficiencies or bugs in the code; some refactoring and update tasks have been scheduled for the last sprint, and I will create a separate pull request for them later. The code in this pull request is nevertheless fully functional, but it should be considered somewhat "experimental", like the WebCL API itself.

I have not yet created test cases for the features introduced in this pull request, but I will include the tests in the upcoming "update pull request". Nevertheless, all the old XML3D/Xflow tests passed both with and without the WebCL platform, so there should not be any breaking changes.

I will also combine the features introduced in the "XML3D data adapter enhancements" with the features in this pull request in the future "update pull request".

Requirements

These are basically the same requirements as for the WebCL API:

Nokia's WebCL plugin for Firefox 25. (The current WebCL plugin for Firefox 26 is not yet supported by the WebCL API! You can get the older plugin from Nokia's git repo: https://github.com/toaarnio/webcl-firefox)
Firefox 25
Check out the other WebCL-related requirements in: https://github.com/xml3d/xml3d.js/wiki/WebCL-API

Included changes/features

WebCL API enhancements
WebCL API integrated into the Xflow architecture
Specific Xflow WebCL operators
Fallback to the JS platform if the WebCL platform is not available

Rough performance measurements

Operators used in the measurements: a heavy blur operator and a very light thresholding operator.
Input data: 2048x1280px color image.

On the JS platform, the execution time of the evaluate function was measured. On the WebCL platform, the combined execution time of the main WebCL application program (which corresponds to the JS evaluate function) and the kernel program was measured. The JS and WebCL versions of each operator implement the same algorithm.

Processing times (on Firefox 25)

JS platform: Blur 1483 ms (avg.), Thresholding 11.45 ms (avg.)

WebCL platform: Blur 5.74 ms (avg.), Thresholding 4.42 ms (avg.)
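To put these averages into perspective, the speedup can be computed directly from the four reported numbers. The snippet below is only a convenience calculation: the `timings` object and the `speedup` helper are names introduced here for illustration, and the thresholding time on the JS platform is assumed to be in milliseconds like the other values.

```javascript
// Average processing times in milliseconds, copied from the measurements above.
// Assumption: the JS thresholding value (11.45) is in ms like the others.
const timings = {
    blur: { js: 1483, webcl: 5.74 },
    thresholding: { js: 11.45, webcl: 4.42 }
};

// Hypothetical helper: how many times faster the WebCL platform was.
function speedup(entry) {
    return entry.js / entry.webcl;
}

const blurSpeedup = speedup(timings.blur);
const thresholdingSpeedup = speedup(timings.thresholding);

console.log("Blur speedup: " + blurSpeedup.toFixed(0) + "x");
console.log("Thresholding speedup: " + thresholdingSpeedup.toFixed(1) + "x");
```

The heavy operator gains roughly two orders of magnitude, while the light operator gains far less. That would be consistent with fixed per-invocation overhead dominating when there is little work per pixel, although these rough measurements do not isolate that overhead.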

Xflow WebCL operators

The Xflow WebCL operator is designed so that a developer can focus purely on the core WebCL kernel logic. Developers write their WebCL kernel code in the "evaluate" attribute of the operator, as shown in the example below. No kernel function headers or input/output parameters need to be defined in that code, because they are defined and initialised automatically under the hood. The idea is to allow the creation of one atomic kernel per Xflow operator, each implementing a specific functionality. This design simplifies the usage of the WebCL operators and the management of the WebCL processes. The downside of this approach is that it introduces some restrictions.

Below is an example of registering a WebCL Xflow operator. This operator applies a blur effect to the input "image" texture parameter and outputs the processed "result" texture.

 Xflow.registerOperator("xflow.blurImage", {
   outputs: [
       {type: 'texture', name: 'result', sizeof: 'image'}
   ],
   params: [
       {type: 'texture', source: 'image'},
       {type: 'int', source: 'blurSize'}
   ],
   platform: Xflow.PLATFORM.CL,
   evaluate: [
       "const float m[9] = {0.05f, 0.09f, 0.12f, 0.15f, 0.16f, 0.15f, 0.12f, 0.09f, 0.05f};",
       "float3 sum = {0.0f, 0.0f, 0.0f};",
       "uchar3 resultSum;",
       "int currentCoord;",
       "for(int j = 0; j < 9; j++) {",
       "    currentCoord = convert_int(image_i - (4-j)*blurSize);",
       // Only accumulate samples that lie inside the image buffer.
       "    if(currentCoord >= 0 && currentCoord < image_width * image_height) {",
       "        sum.x += convert_float_rte(image[currentCoord].x) * m[j];",
       "        sum.y += convert_float_rte(image[currentCoord].y) * m[j];",
       "        sum.z += convert_float_rte(image[currentCoord].z) * m[j];",
       "    }",
       "}",
       "resultSum = convert_uchar3_rte(sum);",
       "result[image_i] = (uchar4)(resultSum.x, resultSum.y, resultSum.z, 255);"
   ]});

The underlying code processes the "outputs" and "params" of the Xflow operator and makes them directly usable in the WebCL kernel code. As seen in the example above, the input parameter "image" can be used directly in the kernel code. An iterator for the first input parameter is also generated automatically and can be safely used in the code. For the "image" param, the iterator variable is named "image_i". In addition, helper variables such as "image_height" and "image_width" are generated for texture parameters and can likewise be used in the evaluate code. All other input parameter types have a "length" helper variable, e.g. "parameterName_length", that gives the length of the input buffer.
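To make the idea of generated headers more concrete, here is a minimal sketch of how an evaluate body could be wrapped into a complete kernel. It is not the code generator from this pull request: `buildKernelSource` is a hypothetical function, the argument types (`uchar4` buffers, `uint` sizes), the scalar handling of `int` params and the added bounds guard are all assumptions made for illustration, and the first param is assumed to be a texture.

```javascript
// Hypothetical sketch, NOT the actual Xflow code generator.
// Wraps the "evaluate" lines of an operator-like object into a full OpenCL C kernel.
function buildKernelSource(kernelName, operator) {
    const args = [];
    operator.params.forEach(function (param) {
        if (param.type === "texture") {
            // Assumption: textures arrive as RGBA uchar4 buffers plus size helpers.
            args.push("__global const uchar4* " + param.source);
            args.push("const uint " + param.source + "_width");
            args.push("const uint " + param.source + "_height");
        } else if (param.type === "int") {
            // Assumption: int params are passed as plain scalars in this sketch.
            args.push("const int " + param.source);
        } else {
            throw new Error("Unsupported param type in this sketch: " + param.type);
        }
    });
    operator.outputs.forEach(function (output) {
        args.push("__global uchar4* " + output.name);
    });

    // The iterator is named after the first input parameter, e.g. "image_i".
    const first = operator.params[0].source;
    const iterator = first + "_i";

    return [
        "__kernel void " + kernelName + "(" + args.join(", ") + ") {",
        "    int " + iterator + " = get_global_id(0);",
        // Guard against work items beyond the last pixel (assumes a texture input).
        "    if (" + iterator + " >= (int)(" + first + "_width * " + first + "_height)) return;",
        operator.evaluate.map(function (line) { return "    " + line; }).join("\n"),
        "}"
    ].join("\n");
}

// Usage with a one-line evaluate body that copies the input to the output.
const source = buildKernelSource("copyImage", {
    outputs: [{type: "texture", name: "result", sizeof: "image"}],
    params: [{type: "texture", source: "image"}],
    evaluate: ["result[image_i] = image[image_i];"]
});
console.log(source);
```

Generating the signature from the declared "outputs" and "params" keeps the operator definition as the single source of truth, which is presumably why the evaluate code can refer to names such as "image" and "image_i" without declaring them.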

Additionally, all WebCL application code needed for executing the WebCL kernel code (such as passing kernel arguments to the WebCL program and defining proper WebCL work-group sizes) is generated automatically. Developers therefore need no deep knowledge of WebCL programming; basic programming skills are enough to produce kernel code for simple WebCL Xflow operators.
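As an illustration of the kind of bookkeeping that is hidden from the developer, the sketch below computes work sizes for a one-dimensional kernel launch. In OpenCL 1.x, on which WebCL is based, the global work size must be a multiple of the local work size when a local size is given, so the item count is rounded up. `computeWorkSizes` and the preferred local size of 256 are assumptions for this example, not the values or functions used by the generated Xflow code.

```javascript
// Hypothetical helper, not part of the WebCL or Xflow API.
// Rounds the number of work items up to the next multiple of the local size.
function computeWorkSizes(itemCount, preferredLocalSize) {
    // Never use a local size larger than the number of items, and never zero.
    const localSize = Math.max(1, Math.min(preferredLocalSize, itemCount));
    const globalSize = Math.ceil(itemCount / localSize) * localSize;
    return { globalSize: globalSize, localSize: localSize };
}

// One work item per pixel of the 2048x1280 image used in the measurements.
const imageSizes = computeWorkSizes(2048 * 1280, 256);
console.log(imageSizes.globalSize, imageSizes.localSize);

// An item count that is not a multiple of the local size gets padded.
const paddedSizes = computeWorkSizes(1000, 256);
console.log(paddedSizes.globalSize, paddedSizes.localSize);
```

When the global size is padded like this, some work items have indices past the end of the data, so the kernel needs a bounds check before it reads or writes.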

Below is an example of a very simple WebCL Xflow operator. This operator converts an input texture to grayscale. Only three lines of kernel code are required.

 Xflow.registerOperator("xflow.desaturateImage", {
       outputs: [
           {type: 'texture', name: 'result', sizeof: 'image'}
       ],
       params: [
           {type: 'texture', source: 'image'}
       ],
       platform: Xflow.PLATFORM.CL,
       evaluate: [
           "uchar4 color = image[image_i];",
           "uchar lum = (uchar)(0.30f * color.x + 0.59f * color.y + 0.11f * color.z);",
           "result[image_i] = (uchar4)(lum, lum, lum, 255);"
       ]
   });
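For comparison, the same per-pixel computation can be written in plain JavaScript, which is roughly what a JS-platform counterpart of this operator has to do. The sketch below is illustrative only: `desaturateRGBA` is a hypothetical function name, and the flat RGBA byte layout is an assumption rather than the Xflow JS operator interface.

```javascript
// Illustrative plain-JavaScript counterpart of the kernel above.
// Assumptions: the function name is hypothetical and pixels are stored
// as a flat array of RGBA bytes.
function desaturateRGBA(input) {
    const result = new Uint8Array(input.length);
    for (let i = 0; i < input.length; i += 4) {
        // Same weights as the kernel; "| 0" truncates like the (uchar) cast.
        const lum = (0.30 * input[i] + 0.59 * input[i + 1] + 0.11 * input[i + 2]) | 0;
        result[i] = lum;
        result[i + 1] = lum;
        result[i + 2] = lum;
        result[i + 3] = 255; // The kernel also forces full opacity.
    }
    return result;
}

// Two pixels: pure red and a dark, semi-transparent color.
const pixels = new Uint8Array([255, 0, 0, 255, 10, 20, 30, 40]);
const gray = desaturateRGBA(pixels);
console.log(Array.from(gray));
```

The loop index plays the role that the generated "image_i" iterator plays in the kernel, except that the kernel handles one pixel per work item instead of visiting the pixels sequentially.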
