msgpack / msgpack-c

MessagePack implementation for C and C++ / msgpack.org[C/C++]
Other
3.03k stars 882 forks source link

Proposal: Modify zones to accept custom malloc(), realloc() and free() functions #768

Open johnwbyrd opened 5 years ago

johnwbyrd commented 5 years ago

The zone system for msgpack seems very useful to me. However, in my case, I must implement msgpack with another memory management system.

It seems to me that another struct may be added to the zone definition, maybe called "zone::os" or something like that. You would create a struct for providing function pointers, something like this:

struct {
  void *(custom_malloc)(size_t)
  void *(custom_realloc)(void *, size_t)
  void *(custom_free)(void *)
}

And then, the zone would use those functions instead of the default. We would of course provide a zone::os_default that would be used by default, which would use the standard malloc() and free() and realloc() functions.

What is your opinion on this idea? If you like it, I may implement it for you. If you want this, please let me know whether I should add it to the v1, v2 or v3 C++ header files. Thank you very much.

redboltz commented 5 years ago

Which part of msgpack-c are you talking about? C, C++, or both?

You wrote zone::os_default, so I think your proposal is for C++. Is that right?

johnwbyrd commented 5 years ago

Yes, of course, my proposal is for the C++ implementation. What are your thoughts? I see no reason it cannot be done for the C implementation as well, but that is not my work focus right now... I could do a C implementation in a month or two if you wish.

redboltz commented 5 years ago

Ok. Basically I agree with you. That means I think that we need some custom allocator. Some of embedded environment might not have malloc/realloc/free.

NOTE: Due to realloc, that is for efficient memory expansion, msgpack-c doesn't use C++ standard allocator.

I think that there is two issues that we should solve.

  1. malloc is used by not only zone, but also unpacker(parser) and buffers (sbuffer, zbuffer, and vrefbuffer). See https://github.com/msgpack/msgpack-c/search?q=malloc&type=Code. I think that we also should provide custom allocator for them.

  2. How to keep backward compatibility. The first malloc could be called in the constructor. So, we can't add the new member function that is setting custom allocator. I think that adding template parameter is good choice. That doesn't use pointer of functions. It is inline friendly.

Here is the code image of my concept:

For zone,

template <typename Allocator>
class basic_zone {
};

typedef basic_zone<msgpack::default_allocator> zone;
redboltz commented 5 years ago

I noticed that some part of msgpack-c core, not including adaptors, uses std::vector with default allocator. See https://github.com/msgpack/msgpack-c/blob/master/include/msgpack/v2/parse.hpp#L213 It finally calls malloc. In this case, we can provide the custom global new but I think that it is not elegant. So I think that we should provide allocator propagating mechanism for that. The allocator concept is C++ standard alloocator concept. It is different from malloc/realloc/free one.

johnwbyrd commented 5 years ago

You have a lot of important points here. I understand, we should keep realloc() for performance. But as you know, std::allocator does not understand the concept.

And you are also correct, we should think through the C case.

For the C case, I suggest that we create a zone_allocator struct like I propose above, and change msgpack_zone to rely upon it.

For the C++ case, I suggest a msgpack C++ allocator, derived from std::allocator_traits, but also including a realloc() function.

We would implement at least two templates derived from this base. The default implementation would rely on malloc(), realloc(), and free() as now. But, we would also provide a version that takes a normal C++ allocator, and implements a realloc() function based on always copying the old memory to a new allocation.

That way, you can use whatever C++ allocator you want, and we would provide our own realloc() function for you, if you don't want to write one. However, most code would want the default optimized msgpack allocator.

If we want to be fancy, we can even create a third template, that accepts a C style msgpack_zone struct during construction, and uses the same zone_allocator as C. Then, you could use the same allocator for both C and C++. I don't need this, but it is possible,

You are also correct, I did not see all the changes that would be necessary to implement this, so it is a big testing job.

If you don't have time, if you like, I will be able to work on this in one month.

Sorry for bothering you about this, but I must implement this for my own use, and with your advice, we might implement it correctly for all use cases, not just mine.

Please give me your thoughts. Thank you very much.

redboltz commented 5 years ago

I don't have much time to implement that big change. If you will do that, I would appreciate!

For C case, I think your approach is good.

I wrote a small code to clarify the concept. Please check it. https://wandbox.org/permlink/STJvUqTTMiaf1EHk Compile success, link error (I don't implement the body of the function)

For C++ case, I will write a comment later.

redboltz commented 5 years ago

For C++ case, I don't get the result what is the best approach.

Here are some points that I want to share what I thought:

  1. msgpack-c needs to support C++03, C++11 and later.
  2. std::allocator_traits is introduced since C++11 http://www.cplusplus.com/reference/memory/allocator_traits/?kw=allocator_traits.
  3. std::allocator has been updated during C++03, 11, and 14. http://www.cplusplus.com/reference/memory/allocator/?kw=allocator
  4. We need to watch realloc() for C++ standard process. See http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0894r1.md. It might be a hint.
  5. Boost.Container already supports realloc() like function https://www.boost.org/doc/libs/1_69_0/doc/html/container/extended_allocators.html. It uses modified DLmalloc https://github.com/boostorg/container/tree/develop/src
  6. msgpack-c should work without boost.

I think that newly introduced allocators are two different kind as follows:

category target data type note
A zone char replacing malloc/realloc/free
A unpack, parse char replacing malloc/realloc/free
A sbuffer, vrefbuffer char replacing malloc/realloc/free
B v1 unpack std::vector adding allocator provider
B v1 object std::vector adding allocator provider
B v2 parse std::vector adding allocator provider
B v2 x3_parse std::vector adding allocator provider
B v3 parse std::vector adding allocator provider
B v2 create_object_visitor std::vector adding allocator provider
B v1 cpp03 define_map std::map adding allocator provider
B v1 cpp11 define_map std::map adding allocator provider

I think that for category B, std::allocator or std::allocator_traits based approach is good. It doesn't require realloc.

For category A, it requires realloc(). But all of their target type is char(byte). Because they treat sending buffer, receiving buffer, and memory pool(zone). std::allocator is more flexible. It takes a template parameter T. It provides rebind mechanism. I think that it is too much for category A.

Here is implicit concept base approach for category A. https://wandbox.org/permlink/u32pvBykXJhWFD1W

What do you think?

redboltz commented 5 years ago

My C++ concept code is a little bit misleading. I updated it. https://wandbox.org/permlink/VhOqg2zzisic4cB5

redboltz commented 5 years ago

Ok, I will review. C++ version first is OK. I will merge only C++ version. But msgpack::zone only is not good. I want to merge all C++ custom allocator support at once.

Do you have any other requests for me?

All classes that support custom allocator should provide allocator getter as follows: https://wandbox.org/permlink/eO65x9VZHSDL2IFH

Thank you for cooperation :)

johnwbyrd commented 5 years ago

Thank you so much for thinking this through with me. I love your analysis of A versus B type allocations. Also, I think your design for zone allocators of type A is correct.

I suggest that we should divide the work into 1) necessary work and 2) optimizations.

Your design for type A allocations is simple and straightforward, let's do that.

For section B, I suggest we provide a msgpack std::allocator that does not depend on allocator_traits, but instead is based on your default_allocator. Then, both type A and type B allocations would call the same core code. There are many examples to start from: http://www.josuttis.com/cppcode/myalloc.hpp.html

If we wish, we can do the same for C++11 and have it rely on std::allocator_traits, but this may not be strictly necessary.

And of course, the core code must be completely changed to only use these allocators.

Once this works and passes your stress tests, we can optimize pathways for Boost.

I think it is reasonable to assume that if you are managing your own memory, you are much more likely to want to manage it via your default_allocator interface, rather than through some custom std::allocator. High-level C++ programmers don't care so much about memory management. It is us low level guys who want to manage all the memory details directly.

I would be happy to follow your header file sketches for type B allocations, or I can design them myself.

Thank you very much for your kind consideration.

redboltz commented 5 years ago

Thank you so much for thinking this through with me. I love your analysis of A versus B type allocations. Also, I think your design for zone allocators of type A is correct.

I suggest that we should divide the work into 1) necessary work and 2) optimizations.

Your design for type A allocations is simple and straightforward, let's do that.

Sounds good.

For section B, I suggest we provide a msgpack std::allocator that does not depend on allocator_traits,

I think that using std::allocator is reasonable. Ok let's reject std::allocator_traits.

but instead is based on your default_allocator. Then, both type A and type B allocations would call the same core code. There are many examples to start from: http://www.josuttis.com/cppcode/myalloc.hpp.html

I don't think type A default_allocator, only for allocate/reallocate/deallocate, is suitable for section B. The point of the section B is providing custom allocator for standard containers such as std::vector and std::map. So the default allocator should be std::allocator. Users can prepare their own custom allocator such as http://www.josuttis.com/cppcode/myalloc.hpp.html.

I'm not sure I could explain what I want to say. So I wrote quick and incomplete (but successfully compiles and pass all tests on C++11 environment) code.

https://github.com/msgpack/msgpack-c/compare/master...redboltz:issue_768_allocator_b

I hope my branch is good hint for implement the custom allocator section B.