xmos / xmath_walkthrough

A tutorial on different approaches to high performance FIR and FFT using xcore.ai
Other
2 stars 2 forks source link

Feedback from Tim Mace #1

Closed astewart-xmos closed 1 year ago

astewart-xmos commented 1 year ago
astewart-xmos commented 1 year ago

Page 9 - In the table, the text in the second column extends into the third - need to adjust the widths of columns 2 and 3

This was an absolute nightmare to fix! In the end the end the only solution I could find was to shorten the descriptions. I spent about 2 hours trying to fix this.

Page 10 - First paragraph - possibly not too important but if we are going to compare the accuracy between the BFP and FP at any stage it would be better to have taps with more of a dynamic range between them

I agree with this, and had originally intended to make it do something more interesting after I basically had everything written and working. But that took a lot longer than anticipated and unfortunately it will be a good bit of work to make this change now.

If we consider this necessary I can go back and do it. Otherwise, if there's ever occasion to release a "version 2" (or whatever), this is one of the updates I would suggest in that.

Page 10 - It doesn't actually make a difference to this example because of the simple filter transfer function but putting the scaling into the coefficients increases the noise floor proportionately. Ideally we should scale after the dot product, particularly since the xcore VPU can capture the headroom. Is this just used in this example or do we scale the coefficients for all filters with this library?

The library doesn't require this to be one way or the other. The filter_fir_s32() implementation uses the (Q2.30) coefficients provided by the user as-is, and the final step to produce the output sample is to apply an arithmetic right-shift to the accumulators. The user is only required to ensure that the coefficients are scaled such that the accumulators won't overflow.

I believe the floating-point --> fixed-point filter conversion python script provided with the library chooses a coefficient scaling where it tries to leave minimal headroom in the filter coefficients. (Though it will only scale them by powers of 2).

But otherwise, it is left up to the user to make sure the coefficient scaling and output shift meet their needs.

There seem to be a number of optimizations that could be made to the code although they all have minor performance implications

I agree there are likely many such optimizations in various places. In some cases I have consciously chosen to prefer clarity of purpose over potentially minor optimizations. For example, none of the loops are unrolled anywhere.

I'm sure in many other cases I just didn't notice that an optimization was possible. If there are any specific optimizations that you think would make an appreciable difference, I'm happy to try them out.

Page 25 - 'The main reason for this speed-up is that the C compiler does not, by default, emit dual-issue implementations of C functions' - Why not? Is there some fundamental reason why the compiler can't do this or is it just on our 'to do' list?

I'm not sure. I believe there are compiler flags or #pragmas that will have the compiler generate dual-issue code. There may also be some cases I'm not aware of where it actually does emit dual-issue code automatically.

My guess (speculation) for why this doesn't generally happen by default is because we're typically more constrained by memory than speed, and emitting dual-issue code will always make the program size larger, particularly if it isn't well optimized.

Page 23 (and others) - not clear what the scale is for the output (guessable) and the sample diff - it would be good to indicate the SNR for comparison

I'm not sure what you mean with respect to the scale not being clear (the Y-axes all show a scale).

Page 34 - There are two formats for the 'Part ##' terms; black bold and blue bold italic - are these supposed to be the same or is there supposed to be a differentiation?

The ones which are blue and in italics are because they are links to those sections in the document. The others are not links because it didn't seem appropriate to make every single one a link.

My operating convention was:

astewart-xmos commented 1 year ago

Closing issue as stale.