MikeLankamp / fpm

C++ header-only fixed-point math library
https://mikelankamp.github.io/fpm
MIT License
672 stars 85 forks source link

Support for Shift Operators #42

Closed dismine closed 2 years ago

dismine commented 2 years ago

Hi.

Thank you for this amazing class. I want to leave my feedback.

First. There are no shift operators.

template <typename I, typename std::enable_if<std::is_integral<I>::value>::type* = nullptr>
inline fixed& operator>>=(I y) noexcept
{
    m_value >>= y;
    return *this;
}

template <typename I, typename std::enable_if<std::is_integral<I>::value>::type* = nullptr>
inline fixed& operator<<=(I y) noexcept
{
    m_value <<= y;
    return *this;
}

template <typename B, typename I, unsigned int F, typename T, typename std::enable_if<std::is_integral<T>::value>::type* = nullptr>
constexpr inline fixed<B, I, F> operator>>(const fixed<B, I, F>& x, T y) noexcept
{
    return fixed<B, I, F>(x) >>= y;
}

template <typename B, typename I, unsigned int F, typename T, typename std::enable_if<std::is_integral<T>::value>::type* = nullptr>
constexpr inline fixed<B, I, F> operator<<(const fixed<B, I, F>& x, T y) noexcept
{
    return fixed<B, I, F>(x) <<= y;
}

I spent a whole day to understand why fixed doesn't work in my case. :thinking: The problem was in this string.

const int count = static_cast<int>(sweep >> (16 - k));

When you use just std::int32_t for type, this will work. But in case of fixed this doesn't return internal value. Because I save inside fixed a floating-point value, this approach is incorrect. It confuses. I am not familiar with fixed-point types. But I do not undersatnd purpose of conversion to integrals.

Correct code in my case.

const int count = (sweep >> (16 - k)).raw_value();
MikeLankamp commented 2 years ago

Hi @dismine, fpm::fixed emulates floating-point types. Shift operators don't make sense for those, and they don't make sense for fpm::fixed either.

From your code snippet

const int count = static_cast<int>(sweep >> (16 - k));

it looks like you want to get the integral part of the fixed-point number. You can get this by just casting to int, as you would with a float:

fpm::fixed_16_16 sweep = /*...*/;
const int count = static_cast<int>(sweep);

fpm will internally do the shift conversion for you. The point of fpm is that you don't have to worry about such internal details of fixed-point numbers and can just work with the type as-if it were a floating-point number.

Does this information help you?

dismine commented 2 years ago

it looks like you want to get the integral part of the fixed-point number. You can get this by just casting to int, as you would with a float:

No. According to the algorithm I have been implementing, I need to shift an internal integer. Conversion to integer in your implementation includes division m_value / FRACTION_MULT. I need access to a whole number represented by the internal number.

fpm::fixed emulates floating-point types. Shift operators don't make sense for those, and they don't make sense for fpm::fixed either.

Okay. But it is strange for me why then shift used in this algorithm. Again, I am new to the concept of such floating-point types. Most probably, I don't understand something. But while it works in my case, I will keep it my code. Just thought maybe it will be interesting for you to look on the algorithm from the document I have mentioned. There shift works.

just work with the type as-if it were a floating-point number.

Thanks, I understand your idea.

BTW, your code produces a lot of warnings with clang-tidy.

Does this information help you?

Yes, thank you. We can close the issue.

MikeLankamp commented 2 years ago

Thanks for that link, that helps. The code in that paper assumes you are doing fixed-point arithmetic yourself. fpm aims to hide that from you.

For instance, the paper presents the following function to implement the circle generator, with FIXED being an int32_t that holds a 16.16 fixed-point number:

inline void CircleGen(FIXED& u, FIXED& v, int k)
{
u -= v >> k;
v += u >> k;
}

This implements the circle generator equations, where ε = 1/2k

un = un-1 − ε vn-1 vn = vn-1 + ε un

or, substituting ε:

un = un-1 − vn-1 / 2k vn = vn-1 + un / 2k

In fixed-point maths, dividing a fixed-point number by 2k is implemented by shifting the number's integer representation right by k bits. This is what their original CircleGen method is doing.

To do this with fpm, you'd write it as you would with floating-point numbers:

inline void CircleGen(fpm::fixed_16_16& u, fpm::fixed_16_16& v, int k)
{
    auto e = fpm::exp2(-k);   // A ldexp version would be better, but fpm doesn't support that (yet)
    u -= v / e;
    v += u / e;
}

If you want to use the original paper's code, you don't need fpm. Just use using FIXED = int32_t.

I hope this helps you forward.

dismine commented 2 years ago

Previously you said

fpm::fixed emulates floating-point types. Shift operators don't make sense for those, and they don't make sense for fpm::fixed either.

But I still don't get why it makes no sense for fpm::fixed to have support for shifts. In my case I use fpm::fixed_16_16. Internally, it is std::int32_t. And shifts work perfectly fine for std::int32_t. Shifts are no more than equivalent to multiplying/dividing x with 2^y (2 raised to power y). It doesn't mean it is impossible for floating-point numbers. And in case of this tricky conversion to internal integer, it becomes possible.

The code in that paper assumes you are doing fixed-point arithmetic yourself. fpm aims to hide that from you.

Exactly. To debug my code, I tried the direct approach. Using fpm::fixed is much cleaner. Especially in case of division and multiplication. It is very easy to forget how to do it in case of FIXED type.

If you want to use the original paper's code, you don't need fpm. Just use using FIXED = int32_t.

I do not agree. Shifts work perfectly fine with fpm::fixed in my case. If you do not try to convert it to integer and instead use raw_value.

For me still strange to hear that fpm::fixed should not support shifts, but at the same time using FIXED is okay.

  1. Shifts are no more than equivalent to multiplying/dividing x with 2^y (2 raised to power y).
  2. This is language restriction, not mathematical restriction.
  3. Conversion to internal integer solves the issue.
  4. The algorithm in the document shows that FIXED (manual version of fpm::fixed) perfectly suited for shifts.

Let's try a small experiment.

double n = 28.634;
fpm::fixed_16_16 xN{28.634};

// n <<= 7; // error
n = n * pow(2, 7); /* 3665.152 */

xN <<= 7;
double res = static_cast<double>(xN); /* 3665.15234375 */

As you can see, the results are pretty close. And if round to three decimal places we will get the same result basically.

Of course, because fpm::fixed is just emulation for floating-point types, we must deal with accuracy of such type. But if it's fine for multiplying and division, why not for shifts. In some cases such accuracy more than enough.

MikeLankamp commented 2 years ago

But I still don't get why it makes no sense for fpm::fixed to have support for shifts. [...] shifts work perfectly fine for std::int32_t.

For built-in types, shift operators are only defined for integral types. You can't shift a float. fpm::fixed pretends to be a float, so you shouldn't be able to shift it.

The fact that internally it implements its behavior with int32_t is an implementation detail. This is mostly hidden from the user of fpm::fixed (granted, it's a leaky abstraction, but we shouldn't make it worse).

Shifts are no more than equivalent to multiplying/dividing x with 2^y (2 raised to power y).

Only for fixed-point implementations using integral types that use bit shifts. (Also, for signed integers before C++20 there's some risk of undefined behavior if the result isn't representable in the type.)

It doesn't mean it is impossible for floating-point numbers.

Shifting is not allowed for floating-point numbers, actually.

To summarize: The paper has optimized its code to directly manipulate a 16.16 fixed-point representation using int32_t where shifts make sense. fpm::fixed provides an abstraction on top that where shifts do not make sense. If you want to use fpm::fixed, you have to work with that abstraction.

So your alternatives are: