Open KOVI89alipes opened 3 years ago
I think I understand the problem. However, I don't think Vivado would support the suggested fix. I believe if constexpr
is a C++17 feature, while Vivado HLS supports only (some of) C++11. I think Vitis HLS brings support up to C++14.
Trying to compile an hls4ml project using the proposed fix with Vivado 2019.2 (GCC 4.6.3 for csim, clang 3.1) on Ubuntu 20.04:
../../../../firmware/nnet_utils/nnet_common.h: In function ‘T nnet::reduce(T*, Op)’:
../../../../firmware/nnet_utils/nnet_common.h:47:5: error: expected ‘(’ before ‘constexpr’
if constexpr (N == 1){
^~~~~~~~~
Does it work on Windows in Vivado HLS?
If you wish to keep c++14 support you can solve the problem via template specializations.
Something like
template <int N>
struct Reduce
{
static constexpr int leftN = pow2(floorlog2(N - 1)) > 0 ? pow2(floorlog2(N - 1)) : 0;
static constexpr int rightN = N - leftN > 0 ? N - leftN : 0;
template <class T, class Op>
static T reduce(T *x, Op op)
{
T left[leftN];
T right[rightN];
#pragma HLS array_partition variable = left complete
#pragma HLS array_partition variable = right complete
ReduceLeft:
for (int i = 0; i < leftN; i++)
{
left[i] = x[i];
}
ReduceRight:
for (int i = 0; i < rightN; i++)
{
right[i] = x[i + leftN];
}
return op(Reduce<leftN>::reduce<T, Op>(left, op), Reduce<rightN>::reduce<T, Op>(right, op));
}
};
template <>
struct Reduce<2>
{
template <class T, class Op>
static T reduce(T *x, Op op)
{
return op(x[0], x[1]);
}
};
template <>
struct Reduce<1>
{
template <class T, class Op>
static T reduce(T *x, Op op)
{
return x[0];
}
};
template <class T, int N, class Op>
T reduce(T *x, Op op)
{
return Reduce<N>::reduce(x, op);
}
I think that should work. Perhaps someone can try this and make a PR?
Just to make it clear, is there a reason why we copy the contents of x into left and right? I'm unsure what the HLS pragmas do but a simpler and more effective operation would be:
template<class T, int N, class Op>
T reduce(const T* x, Op op)
{
static constexpr int leftN = pow2(floorlog2(N - 1)) > 0 ? pow2(floorlog2(N - 1)) : 0;
static constexpr int rightN = N - leftN > 0 ? N - leftN : 0;
if (N == 1){
x[0];
}
else if (N == 2){
return op(x[0],x[1]);
}
return op(reduce<T,leftN,Op>(x, op), reduce<T,rightN,Op>(x+leftN, op));
}
If that is possible then I guess we dont need all the template specialization etc.
I'll gladly do the PR in any case
I think that could do it. The purpose of the reduce
function is to force Vivado to use a balanced tree for some binary operator. We use it to sum an array, or find the element with the maximum value for example. So the pragmas are just part of making sure we get the correct structure at the end (see image).
To see that it's worked as intended I would check that the latency and II of a model that uses it doesn't change after C Synthesis. This could be done simply with some small standalone (outside of hls4ml) test that just calls the reduce
function. I'd also want to see that it's functionally correct, again that could be done in a standalone testbench to check that when used with Op_add
, it gives the same result as summing the elements in a loop.
One of the example models using Softmax
would also be a good probe, for example the KERAS_3layer
model in the example_models
directory (Github submodule). To run the tests offline one can execute bash hls4ml-keras-test.sh
in the test/
directory. You can skip some models by commenting out lines in keras-models.txt
. I think we only need to look at KERAS_3layer
.
Also Pooling with io_stream
needs to be checked.
In C++ it is illegal to declare an array of zero length. https://stackoverflow.com/a/6180200/14582299
But it's allowed in the GNU C extension https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
Zero size array appears in the
reduce
template functionhttps://github.com/fastmachinelearning/hls4ml/blob/c3509cfeb5f80054e01dead612392a9411ebfbe2/hls4ml/templates/vivado/nnet_utils/nnet_common.h#L47
And it gives an error in MSVC compilation
Possible fix - replace
if
withif constexpr