Closed tomcombriat closed 7 months ago
Memory usage change @ d91bafa2e47be3d01802a75eac5b016ddaddcb04
Board | flash | % | RAM for global variables | % |
---|---|---|---|---|
STMicroelectronics:stm32:GenF1:pnum=BLUEPILL_F103C8 |
0 - 0 | 0.0 - 0.0 | 0 - 0 | 0.0 - 0.0 |
arduino:avr:uno |
0 - 0 | 0.0 - 0.0 | 0 - 0 | 0.0 - 0.0 |
I actually made a mistake in my previous comment. Here is a correct version. I am hesitating a bit between:
template<int8_t _NF=FixMathPrivate::FM_min(NI*2+NF-1,64-NF)>
constexpr UFix<NF,_NF> invAccurate() const
{
static_assert(NF+_NF+1<=64, "The accurate inverse cannot be computed because the asked precision is too great. Reduce the number of bits.");
return UFix<NF,_NF>(UFix<NF,_NF+1>::msbone()/internal_value,true);
}
template<int8_t _NF=FixMathPrivate::FM_min(NI*2+NF,64-NF)>
constexpr UFix<NF,_NF-1> invAccurateb() const
{
static_assert(NF+_NF<=64, "The accurate inverse cannot be computed because the asked precision is too great. Reduce the number of bits.");
return UFix<NF,_NF-1>(UFix<NF,_NF>::msbone()/internal_value,true);
}
for the default template parameter, they behave the same, but they don't when using invAccurate<_NF>
. The first one is more logical as it returns a number with _NF
as precision, but it uses something one bit bigger for internal calculation: UFix<NF,_NF+1>::msbone()
. The second one returns a number with _NF-1
as precision, which can be confusing, but do not use this extra bit for calculation. Put into an example:
UFix<0,16> a = 0.5;
auto b = a.invAccurate<16>(); -> returns a UFix<16,16>, but uses a 16+16+1 = 33 bits internally (which is not super efficient)
auto c = a.invAccurateb<16>(); -> returns a UFix<16,15> which is less logical but do not excedd 32 internally
If you do not have any strong opinion about it I think I'll go for the first one, with a clear enough note in the documentation.
Memory usage change @ 09e74867511320bf90c682f8f932b5f5d1566484
Board | flash | % | RAM for global variables | % |
---|---|---|---|---|
STMicroelectronics:stm32:GenF1:pnum=BLUEPILL_F103C8 |
0 - 0 | 0.0 - 0.0 | 0 - 0 | 0.0 - 0.0 |
arduino:avr:uno |
0 - 0 | 0.0 - 0.0 | 0 - 0 | 0.0 - 0.0 |
Memory usage change @ ef101449343e409534b6392806563212b723bdc2
Board | flash | % | RAM for global variables | % |
---|---|---|---|---|
STMicroelectronics:stm32:GenF1:pnum=BLUEPILL_F103C8 |
0 - 0 | 0.0 - 0.0 | 0 - 0 | 0.0 - 0.0 |
arduino:avr:uno |
0 - 0 | 0.0 - 0.0 | 0 - 0 | 0.0 - 0.0 |
Done! If you have some comments (or an opinino on the previous comment) let me know! Otherwise I'll merge in a few days…
The static_assert with invAccurate
does work now ;)
Memory usage change @ 3a4ebd7ab33f23bc55f534a2be8d010f892e4d5e
Board | flash | % | RAM for global variables | % |
---|---|---|---|---|
STMicroelectronics:stm32:GenF1:pnum=BLUEPILL_F103C8 |
0 - 0 | 0.0 - 0.0 | 0 - 0 | 0.0 - 0.0 |
arduino:avr:uno |
0 - 0 | 0.0 - 0.0 | 0 - 0 | 0.0 - 0.0 |
Memory usage change @ 200475d355fe067f7697969912d30a849b60ef65
Board | flash | % | RAM for global variables | % |
---|---|---|---|---|
STMicroelectronics:stm32:GenF1:pnum=BLUEPILL_F103C8 |
0 - 0 | 0.0 - 0.0 | 0 - 0 | 0.0 - 0.0 |
arduino:avr:uno |
0 - 0 | 0.0 - 0.0 | 0 - 0 | 0.0 - 0.0 |
If you do not have any strong opinion about it I think I'll go for the first one, with a clear enough note in the documentation.
I'd lean towards the first one, too.
For now a draft PR, to allow space for discussion.
First of, I optimized a bit the existing inverse functions, giving them the possibility to work on types one bits bigger than before.
invAccurate
has been renamedinvFull
.Need to discuss
I implemented a new
invAccurate
, which returns exact values when possible. This calculation is done by dividing a shifted 1 by the value (and not a full mask of ones). Now, this require the call to a type which is normally one bit bigger than the original. To allow computation of inverses up toinvAccurate
) the native return type is notUFix<NF,2*NI+NF>
butUFix<NF,2*NI+NF**-1**>
. In the same way, usinginvAccurate<_NF>
** returns aUFix<NF,_NF-1>
. I think this can be confusing, but this avoids to ask for a type bigger than 64 bits when inverting 32bits types (which is a good thing I think) and returns the native precision of the operation.UFix<NF,2*NI+NF>
which is more usual, knowing that the last bit will always be 0. This is less confusing but at the same time this promotes one bit too big (and we know it) which I would prefer to avoid.Opinion needed:
Edit:
Note that only
UFix
is done, I will do the other one once we are settled on a solution.** I made
invAccurate
a template with a default argument.