msgpack / msgpack-c

MessagePack implementation for C and C++ / msgpack.org[C/C++]
Other
3.03k stars 883 forks source link

Revert "Merge pull request #1018 from GeorgFritze/cpp_master" #1144

Closed stephanlachnit closed 3 weeks ago

stephanlachnit commented 1 month ago

This reverts #1018, which introduces a "feature" to pack some floats as integers. I think this is a horrible idea for several reasons:

1. It is slow

msgpack should be small, but also fast. If I want to pack a large array of floats, I don't want that every single float is checked for NaN, or checked if it can possibility cast to an integer without precision loss (in total 4-7 checks for a single float).

2. It is unexpected

When I pack a float with pack_float, I also expect to unpack a float. For example, I pack a vector of floats somewhere and and want to unpack it to a vector of floats. I don't expect any other object than float in the msgpack array. Just looking at how often #1018 what referenced in other public projects on GitHub, this seemed to be an issue. It is even more unexpected when other msgpack implementations, such as Python, won't do this.

3. It is just bad design

When I code in C++ I want maximum speed, not 7 checks to save 3 bytes in maybe 0.1% of the cases. If I know I can get a large quantities of just zeros or integer-like values, I can implement zero-suppression or just cast it to an integer directly with my own logic.

If anything, a type change should only be behind an option like MSGPACK_COMPACT_FLOATS, but I actually think that removing it is much better. If you prefer making this optional, even default, let me know, I will add it.

redboltz commented 1 month ago

@stephanlachnit thank you for sending the PR.

I think that @stephanlachnit idea is reasonable. @GeorgFritze, if you have any objections, please post your idea here. I will wait a week, and if no objections I will merge the PR.

GeorgFritze commented 3 weeks ago

Yes, I think all your @stephanlachnit points are valid. Back then (and to some extent even now) I was working with very large double vectors and this decision somehow made sense in my little thought bubble. In hindsight, I think I unintentionally caused a lot more confusion and harm than good to anyone. I would like to apologise to everyone who was affected by this. I would somehow still be happy to see this ‘feature’ in as an opt-in flag. Since only 0.1% of users benefit from it, I understand if you decide against it.

redboltz commented 3 weeks ago

@GeorgFritze , thank you for the comment.

I will merge the PR.

In order to opt-in, I don't want to introduce a conditional preprocessor macro, because maintenanceability. But I noticed that the pack adaptor can specialize for double. You can use the following approach in your code:

code example

#include <iostream>
#include <msgpack.hpp>

#if 1 // you can enable/disable adaptor here

namespace msgpack {
MSGPACK_API_VERSION_NAMESPACE(MSGPACK_DEFAULT_API_NS) {
namespace adaptor {

template<>
struct pack<double> {
    template <typename Stream>
    packer<Stream>& operator()(msgpack::packer<Stream>& o, double v) const {
        if(v == v) { // check for nan
            // compare d to limits to avoid undefined behaviour
            if(v >= 0 && v <= double(std::numeric_limits<uint64_t>::max()) && v == double(uint64_t(v))) {
                o.pack_uint64(uint64_t(v));
                return o;
            } else if(v < 0 && v >= double(std::numeric_limits<int64_t>::min()) && v == double(int64_t(v))) {
                o.pack_int64(int64_t(v));
                return o;
            }
        }
        o.pack_double(v);
        return o;
    }
};

} // namespace adaptor
} // MSGPACK_API_VERSION_NAMESPACE(MSGPACK_DEFAULT_API_NS)
} // namespace msgpack

#endif

inline std::ostream& hex_dump(std::ostream& o, std::string const& v) {
    std::ios::fmtflags f(o.flags());
    o << std::hex;
    for (auto c : v) {
        o << "0x" << std::setw(2) << std::setfill('0') << (static_cast<int>(c) & 0xff) << ' ';
    }
    o.flags(f);
    return o;
}

int main() {
    double d1 = 1.23;
    double d2 = 2.00;
    msgpack::sbuffer buf1;
    msgpack::sbuffer buf2;
    msgpack::pack(buf1, d1);
    msgpack::pack(buf2, d2);

    hex_dump(std::cout, std::string{buf1.data(), buf1.size()}) << std::endl;
    hex_dump(std::cout, std::string{buf2.data(), buf2.size()}) << std::endl;
}

output

#if 0

0xcb 0x3f 0xf3 0xae 0x14 0x7a 0xe1 0x47 0xae
0xcb 0x40 0x00 0x00 0x00 0x00 0x00 0x00 0x00

#if 1

0xcb 0x3f 0xf3 0xae 0x14 0x7a 0xe1 0x47 0xae
0x02

NOTE: Please don't forget the compile and run the example code on https://github.com/stephanlachnit/msgpack-c/tree/p-cpp-revert-broken-float , the current cpp_master always return the result the same as #if 1.

simonspa commented 3 weeks ago

Thank you very much @redboltz for merging this fix! Would you mind also tagging a release which fixes this issue?

redboltz commented 3 weeks ago

Thank you very much @redboltz for merging this fix! Would you mind also tagging a release which fixes this issue?

Released as 7.0.0.