Open jonesmz opened 5 years ago
Michael, thanks for the issue!
Some of your requests are possible to implement and they are really needed. I`ll keep this issue as a guideline for documentation improvement.
Requests about connection type and transaction are very related to the current refactoring and should be ready in this quarter for sure.
Requests about examples 3 and 8 are out of the library scope - IMO these are subject for self-education. I doubt that we will spend some resources here. Anyway if you want to contribute to that part - as always I'd be glad to see and help with PR.
And I'd just wondering - do you need some explanation on one of these topics right now or it is just a suggestion to help other people? Because if you have any questions with not well-documented functionality I would love to give answers for you.
Are you sure you meant that example 3 is out of library scope?
Example 3 is about when to use ozo::connection_pool
versus ozo::connection
, which seems like something that the project should provide more documentation on since the only way to learn about these topics would be to ask the project contributors, or read the documentation / source code.
Perhaps you meant that example 4 is out of scope? That's the one about storing the results of a callback into a member variable.
I agree with you that it's not directly pertinent to the OZO code, but do keep in mind that the more valid examples that are provided to users of the library, the easier it is for people to make connections in their mind about the appropriate use of the library. By providing an example like that, you help people to more immediately see how to use the OZO library in their existing codebase. There are undoubtedly thousands of projects that use libpqxx in codebases that are not easily refactored into asyncronous code, and sometimes one additional valid example is all those projects need to fully understanding how to convert to OZO.
The reason why I point out example 8 is because Boost.Asio has several examples that use std::bind
(Or boost::bind
, in their case), and in fact has several helper functions to make it easier to use std::bind
with Boost.Asio. Having an example that is 80% the same as another, with the only major difference being the callback is a member function, instead of a lambda, would be sufficient.
I'm not making any promises here, but if the core contributors of OZO are able to provide examples for my other requests, I would likely be able to submit examples for 4 and 8.
do you need some explanation on one of these topics right now
Yes, actually. Thank you for the offer. I have three pressing questions that are preventing me from adopting OZO.
1) In my code, I don't want to care about how the connection is established. I have a URI connection string, and simply want the library to figure it out (Including re-connecting on failure) and only report back if it can't re-connect on failure several times. (In which case, my application will simply log failure and exit).
Is this what ozo::connection_pool
will do for me? Is there a strong reason not to use ozo::connection_pool
, and use ozo::connection
instead? I'm not particularly sensitive about performance. The switch from synchronous libpqxx code to async OZO code will already alleviate a huge bottleneck so I don't care about other minor performance concerns at this time.
2) How should I manage the lifetime of ozo::connection
and/or ozo::connection_pool
? In my code, these will be passed to multiple objects, and each of these objects will create and execute queries in arbitrary fashion. The easiest thing for me would be to simply pass ozo::connection
or ozo::connection_pool
by value to each of the objects that needs to access the connection. But if that's not a good idea, I'll just make them std::shared_ptr
instead.
3) Lifetime of parameters. If I do this:
using namespace ozo::literals;
auto query = "select * from table where column="_SQL + std::string_view(SomeStdStringHere);
Will OZO copy the string contents? Or assume that I'm keeping SomeStdStringHere
alive?
If OZO will copy the string contents, this is an important performance consideration and should be pointed out in the HowTo, or somewhere.
If OZO will not copy the string contents, and instead requires that the lifetime of the SomeStdStringHere
variable be managed, I'll just move SomeStdStringHere
into std::shared_ptr<std::string>
and pass the shared_ptr
into the callback, to keep the data alive.
My recommendation is as follows:
Provide some ozo::copy_string_parameter
type that explicitly copies the string data and keeps it alive for the lifetime of the query. E.g.
using namespace ozo::literals;
auto query = "select * from table where column="_SQL + ozo::copy_string_parameter(SomeStdStringHere);
If the ozo::copy_string_parameter
concept is not used, then the default behavior is to assume the lifetime of the provided parameter extends past the lifetime of the query.
But in either case, I need to know what the current behavior of OZO is so I can start using it.
Added
Is there any way to have ozo automatically manage the storage for the result? In libpqxx, the query returns a "row" object, which you can use to fetch data with raw string literals to map names to data. A facility in OZO to allow the library to automatically manage the storage for the results, and provide them as parameters to the callbacks, would be very helpful for end users who are either migrating from libpqxx, or who just don't care about this aspect of performance.
To my list of requested examples.
If there's no existing facility, then I can create a new issue to request this as a feature.
Ok, I try to answer the most important questions for you now and continue the dialog about examples a little bit later if you don't mind.
- In my code, I don't want to care about how the connection is established. I have a URI connection string, and simply want the library to figure it out (Including re-connecting on failure) and only report back if it can't re-connect on failure several times. (In which case, my application will simply log failure and exit).
Here is an example of the request with retries examples/retry_request.cpp. In the example, connection_info
is used as a connection source, but the connection_pool
may be used as well if you need a connection pooling to avoid connection establishing each time you want to get one. The pool creates new one on demand.
Notice: the library do not support multihost connection string, but it may be implemented via retries as it done in tests
- How should I manage the lifetime of ozo::connection and/or ozo::connection_pool ? In my code, these will be passed to multiple objects, and each of these objects will create and execute queries in arbitrary fashion. The easiest thing for me would be to simply pass ozo::connection or ozo::connection_pool by value to each of the objects that needs to access the connection. But if that's not a good idea, I'll just make them std::shared_ptr instead.
This is good question.
ozo::connection
and ozo::connection_pool
are non-copyable and a user should care about the lifetime of the objects.
ozo::connection_pool
It is ok to share it via shared_ptr
, but in most cases the lifetime of the object is well determined and related to the lifetime of an application. From my experience it is better to do not use shared ownership and share the object via refernce or pointer since you can guarantee its lifetime. The lifetime of connection pool object should be sorter than io_context objects are used with connections. Since connections in the pool still have references on io_context objects, connections should be destroyed first.
ozo::connection
In current implementation ozo::connection_info
and ozo::connection_pool
provide std::shared_ptr
of ozo::connection
and ozo::detail::pooled_connection
respectively. This behavior should be changed until the first stable version. The aim is to produce non-copyable objects form the connection sources (connection_info, connection_pool) since the shared connection object approach is an error-prone thing in general. It was a big mistake to produce shared pointer on connection by default. So the main pattern for callback-style usage is:
auto res = std::make_unique<ozo::rows_of<...>>();
ozo::request(conn_info[io], query, 1s, ozo::into(*res),
[res=std::move(res)] (ozo::error_code ec, auto conn) {
if (ec) {
log_error(ec);
return;
}
process_result(*res);
ozo::execute(std::move(conn), query2, 1s, [] (ozo::error_code ec, auto conn) {
// ...
});
}
So it is better to handle connection as non-copyable to be ready for the breaking changes in the future. But if you want to share the connection object for any reason it is up to you to make it shared. std::shared_ptr<ozo::connection>
would be interpreted as a model of Connection concept that can be used with any function related to the Connection concept. For now I'd recommend to check the connection type of the source that is used to be robust to the possible upcoming changes and wrap the connection into std::shared_ptr
as it is needed. The connection type may be obtained from connection source via ozo::connection_type.
using connection_type = ozo::connection_type<decltype(conninfo)>;
Lifetime of parameters. If I do this:
using namespace ozo::literals; auto query = "select * from table where column="_SQL + std::string_view(SomeStdStringHere);
Will OZO copy the string contents? Or assume that I'm keeping SomeStdStringHere alive?
Query builder uses copy semantics (as default for C++ in general), so in your case it contains a copy of std::string_view
with possible dangling reference depending on SomeStdStringHere lifetime. So a user should use std::string
for copy. Or a user may apply std::ref
/std::cref
to avoid a copy. In that case a user should care about an object lifetime and avoid dangling references.
Is there any way to have ozo automatically manage the storage for the result?
The library assumes standard solutions like stack placement and std::make_unique
or std::make_shared
to allocate the result container object and control its lifetime. We avoid to pass a result in the operation callback. We have such approach in out internal solution - and it is not convenient especially for stackless coroutines emulation from Boost.Asio due to different callback signatures. It is a different kind of the interface and I do not think we will spend resources to support it, at least for now. This is a recource managment decision because we have more priority tasks like single-row mode support or notification support and so on.
Thanks for the concrete questions!
Hope that helps.
Is ozo::connection_pool
a connection provider?
I'm trying to pass a ozo::connection_pool
to the ozo::request
function, and am getting a static assertion.
Btw, in your example:
ozo::request(conn_info[io], query, 1s, ozo::into(*res),
[res=std::move(res)] (ozo::error_code ec, auto conn) {
The use of res
is undefined behavior. The evaluation order of function parameters is platform specific. The std::move()
operation for the lambda may occur prior to operator*
This is one of the reasons that requiring the user of the API manage storage of the result is undesirable. It's very easy to make a mistake.
Re-reading your explanations, I'm starting to get the impression that the expected usage of ozo is to create a new ozo::connection
object for each request. Do I understand correctly?
Does this mean that a new connection is established to the database server for each ozo::connection
? Including the authentication handshake?
I hope I am incorrect, as that seems very inefficient.
Could you clarify ?
Query builder uses copy semantics (as default for C++ in general), so in your case it contains a copy of std::string_view with possible dangling reference depending on SomeStdStringHere lifetime. So a user should use std::string for copy. Or a user may apply std::ref/std::cref to avoid a copy. In that case a user should care about an object lifetime and avoid dangling references.
This information needs to go into the howto.
In fact, the howto probably should avoid using integer parameters entirely, so that it is more clear what the lifetime semantics of OZO are.
Hi, Michael.
I've stared to work on examples. Look at my pr draft: https://github.com/yandex/ozo/pull/192 . So far only two new examples using asio::use_future
completion token and callback function. You haven't mentioned this explicitly but these are possible replacements for Boost.Coroutine.
Is
ozo::connection_pool
a connection provider?I'm trying to pass a
ozo::connection_pool
to theozo::request
function, and am getting a static assertion.
It's not, look at this example. ozo::connection_pool
is needed to be bind to io_context
to create a connection provider object.
Re-reading your explanations, I'm starting to get the impression that the expected usage of ozo is to create a new
ozo::connection
object for each request. Do I understand correctly?Does this mean that a new connection is established to the database server for each
ozo::connection
? Including the authentication handshake?I hope I am incorrect, as that seems very inefficient.
Could you clarify ?
That's where ozo::connection_pool
helps. It caches all valid connections and allows to reuse them. If a connection provider is used based only on ozo::connection_info
then a connection will be established on first ozo::request
or other operation call. Connection will be closed only on object destruction. So it's possible to reuse this object and underlying connection at least until first failed operation.
It's not, look at this example. ozo::connection_pool is needed to be bind to io_context to create a connection provider object.
Thank you for clarifying.
This is not at all clear in the existing documentation, or by reading the code.
I'm also a little frustrated at this design choice. I don't understand why a connection pool of "live" connections would need me to provide an io_context. My initial understanding was that the pool itself would be bound to an io_context, and any connections provided from it would come from that io_context.
I'm able to get closer to using OZO now that I understand this, however.
EDIT:
Nevermind the below, I found execute.h, which does what I want.
I see in request.h
that all of the overloads of request_op::operator()
require an Out
parameter.
Is there any support for queries that return no results?
E.g. in libpqxx there are exec0
and exec_params0
functions that know the end user is expecting no return values.
auto res = std::make_unique<ozo::rows_of<...>>();
ozo::request(conn_info[io], query, 1s, ozo::into(*res),
[res=std::move(res)] (ozo::error_code ec, auto conn) {
if (ec) {
log_error(ec);
return;
}
process_result(*res);
}
I ended up making a helper function to automatically allocate storage for me.
template<typename RETURN_T, typename CONNECTION_T, typename QUERY_T, typename HANDLER_T>
decltype(auto) execute_db_query(CONNECTION_T && conn, QUERY_T && query, HANDLER_T && handler)
{
auto pRet = std::make_unique<RETURN_T>();
auto into = ozo::into(*pRet);
return ozo::request(std::forward<CONNECTION_T>(conn),
std::forward<QUERY_T>(query),
std::move(into),
[pRet = std::move(pRet), handler = std::move(handler)]
(auto ec, auto conn) mutable
{
handler(std::move(ec), std::move(conn), std::move(*pRet));
});
}
Added another requested example:
- An example of mapping a complex type, like nlohmann::json's json object type, to the result of a query. The existing how-to shows how to map data from C++ -> SQL, but not the other way around.
auto res = std::make_unique<ozo::rows_of<...>>(); ozo::request(conn_info[io], query, 1s, ozo::into(*res), [res=std::move(res)] (ozo::error_code ec, auto conn) { if (ec) { log_error(ec); return; } process_result(*res); }
I ended up making a helper function to automatically allocate storage for me.
template<typename RETURN_T, typename CONNECTION_T, typename QUERY_T, typename HANDLER_T> decltype(auto) execute_db_query(CONNECTION_T && conn, QUERY_T && query, HANDLER_T && handler) { auto pRet = std::make_unique<RETURN_T>(); auto into = ozo::into(*pRet); return ozo::request(std::forward<CONNECTION_T>(conn), std::forward<QUERY_T>(query), std::move(into), [pRet = std::move(pRet), handler = std::move(handler)] (auto ec, auto conn) mutable { handler(std::move(ec), std::move(conn), std::move(*pRet)); }); }
That's a good choice. The only thing I suggest for you is using a functional object instead of lambda, to preserve original handler associated executor and allocator:
template <typename RETURN_T, typename HANDLER_T>
struct result_wrapper {
std::unique_ptr<RETURN_T> pRet = std::make_unique<RETURN_T>(); // Even better here to use handler associated allocator
HANDLER_T handler;
template <typename Connection>
void operator (ozo::error_code ec, Connection&& conn) {
handler(std::move(ec), std::move(conn), std::move(*pRet));
}
using allocator_type = boost::asio::assiciated_allocator_t< HANDLER_T >;
allocator_type get_allocator() const { return boost::asio::get_associated_allocator(handler); }
using executor_type = boost::asio::associated_executor_t< HANDLER_T >;
executor_type get_executor() const { return boost::asio::get_associated_executor(handler); }
}
I'm also a little frustrated at this design choice. I don't understand why a connection pool of "live" connections would need me to provide an io_context. My initial understanding was that the pool itself would be bound to an io_context, and any connections provided from it would come from that io_context.
It was made to be able to use the established connection with several io_context objects. E.g., for a multithreaded application, it is better to use an io_context instance per thread. If the connection_pool is bound to the certain io_context instance - it is impossible to use an established connection within different threads, and this leads to the number of connection amplification by the number of threads. Connections are not free even with using external poolers like pgpool
. So with the current design, the connection pool may rebind connection to a different io_context as it needed.
That's a good choice. The only thing I suggest for you is using a functional object instead of lambda, to preserve original handler associated executor and allocator:
Do you think this is a helper function that could be included into ozo?
E.g. ozo::easy_request
, or something similar.
Here's an allocator aware result_wrapper struct, based on your suggestion.
template <typename RETURN_T, typename HANDLER_T>
struct result_wrapper
{
using executor_type = boost::asio::associated_executor_t<HANDLER_T>;
using allocator_type = typename boost::asio::associated_allocator<HANDLER_T, std::allocator<void>>::type;
using handler_alloc_traits_t = std::allocator_traits<allocator_type>;
using return_alloc_t = typename handler_alloc_traits_t::template rebind_alloc<RETURN_T>;
using return_alloc_traits_t = std::allocator_traits<return_alloc_t>;
result_wrapper(HANDLER_T handler)
: m_handler(std::move(handler))
, pRet([this](void)
{
// Create an allocator for the return value by rebinding the provided allocator via copy-construction.
return_alloc_t retAlloc(get_allocator());
auto* const p = return_alloc_traits_t::allocate(retAlloc, 1);
try
{
return_alloc_traits_t::construct(retAlloc, p);
}
catch(std::exception const& ex)
{
return_alloc_traits_t::deallocate(retAlloc, p, 1);
throw ex;
}
return p;
}())
{ }
executor_type get_executor() const { return boost::asio::get_associated_executor(m_handler); }
allocator_type get_allocator() const { return boost::asio::get_associated_allocator(m_handler); }
template <typename Connection>
void operator()(ozo::error_code ec, Connection&& conn)
{
m_handler(std::move(ec), std::move(conn), std::move(*pRet));
}
HANDLER_T m_handler;
std::unique_ptr<RETURN_T> pRet;
};
template<typename RETURN_T, typename CONNECTION_T, typename QUERY_T, typename HANDLER_T>
decltype(auto) execute_db_query(CONNECTION_T && conn, QUERY_T && query, HANDLER_T && handler)
{
result_wrapper<RETURN_T, HANDLER_T> wrapper(std::forward<HANDLER_T>(handler));
auto into = ozo::into(*(wrapper.pRet));
return ozo::request(std::forward<CONNECTION_T>(conn),
std::forward<QUERY_T>(query),
std::move(into),
std::move(wrapper));
} // execute_db_query
Though this implementation has a significant flaw in that it doesn't provide a custom deleter to std::unique_ptr that uses the allocator to destruct and deallocate the results.
Hi, Michael!
Though this implementation has a significant flaw in that it doesn't provide a custom deleter to std::unique_ptr that uses the allocator to destruct and deallocate the results.
You may use a custom deleter similar to this one (but more accurate :) ) or you may dive deeper into this proposal or something similar:
struct deleter {
bool constructed_ = false; // indicates if the object has been constructed
return_alloc_t alloc_;
deleter(return_alloc_t alloc) : alloc_(alloc) {}
void operator () (T* v) const {
if (constructed_) {
return_alloc_traits_t::destroy(alloc, v);
}
return_alloc_traits_t::deallocate(alloc, v);
}
};
//...
result_wrapper(HANDLER_T handler)
: m_handler(std::move(handler)),
pRet(return_alloc_traits_t::allocate(get_allocator(), 1), deleter{get_allocator()}) {
return_alloc_traits_t::construct(retAlloc, p);
pRet.get_deleter(). constructed_ = true;
}
Do you think this is a helper function that could be included into ozo?
E.g. ozo::easy_request, or something similar.
I can not promise you anything here since the helper is only useful for the callback-style interface and this is a problem now. One of the main ideas is to provide the universal asynchronous interface that supports all the completion token types from Boost.Asio. This approach allows significant reducing the support cost of the project. I do not want to say "no", because you are interested in solving this problem and from my experience, there is a chance that in the future this functionality may be requested by a significant amount of users. In this case, it worth to provide such interface out-of-the-box. So, I propose you next:
At least you can provide a callback support library which we can recommend to use with OZO for people who need it.
Hope that helps.
This is how I implemented the deleter. Unfortunately your example wasn't complete. allocator_traits<>::allocate require lvalue-reference to allocator, so can't be used in the manner you demonstrated.
template <typename RETURN_T, typename HANDLER_T>
struct result_wrapper
{
using executor_type = boost::asio::associated_executor_t<HANDLER_T>;
using allocator_type = typename boost::asio::associated_allocator<HANDLER_T, std::allocator<void>>::type;
using return_alloc_t = typename std::allocator_traits<allocator_type>::template rebind_alloc<RETURN_T>;
using return_alloc_traits_t = std::allocator_traits<return_alloc_t>;
/*
* TODO: Provide SFINAE specialization for non-stateful deleters.
*/
struct deleter
{
void operator()(typename return_alloc_traits_t::pointer p)
{
return_alloc_traits_t::destroy(m_alloc, p);
return_alloc_traits_t::deallocate(m_alloc, p, 1);
}
return_alloc_t m_alloc;
};
using return_value_ptr_t = std::unique_ptr<typename return_alloc_traits_t::value_type, deleter>;
result_wrapper(HANDLER_T handler)
: m_handler(std::move(handler))
, m_pRet([this](void) -> return_value_ptr_t
{
// Create an allocator for the return value by rebinding the provided allocator via copy-construction.
return_alloc_t retAlloc(get_allocator());
auto* const p = return_alloc_traits_t::allocate(retAlloc, 1);
try
{
return_alloc_traits_t::construct(retAlloc, p);
}
catch(...)
{
return_alloc_traits_t::deallocate(retAlloc, p, 1);
throw; // rethrow exception
}
return {p, deleter{std::move(retAlloc)}};
}())
{ }
decltype(auto) get_executor() const { return boost::asio::get_associated_executor(m_handler); }
decltype(auto) get_allocator() const { return boost::asio::get_associated_allocator(m_handler); }
template <typename Connection>
void operator()(ozo::error_code ec, Connection&& conn)
{
m_handler(std::move(ec), std::move(conn), std::move(*m_pRet));
}
HANDLER_T m_handler;
return_value_ptr_t m_pRet;
};
template<typename RETURN_T, typename CONNECTION_T, typename QUERY_T, typename HANDLER_T>
decltype(auto) execute_db_query(CONNECTION_T && conn, QUERY_T && query, HANDLER_T && handler)
{
result_wrapper<RETURN_T, HANDLER_T> wrapper(std::forward<HANDLER_T>(handler));
auto into = ozo::into(*(wrapper.m_pRet));
return ozo::request(std::forward<CONNECTION_T>(conn),
std::forward<QUERY_T>(query),
std::chrono::seconds(5), // Wait this long for a connection, or result.
std::move(into),
std::move(wrapper));
} // execute_db_query
there is a chance that in the future this functionality may be requested by a significant amount of users.
Yes, callback based ASIO is
As OZO gains popularity, it is what the majority of new users will use when they first start.
So, I propose you next:
I may be able to. I can't promise anything. But maybe.
Thank you for discussing with me. I appreciate the help understanding ozo.
Boost.Coroutine is, clearly, an extremely powerful framework. However, I am in a situation where adopting Boost.Coroutine is not currently an option.
I would like to see examples that don't use coroutines, that accomplish similar things to what's found here: https://github.com/yandex/ozo/blob/master/examples/request.cpp
For my situation, I am attempting to use OZO to replace libpqxx. Both OZO and libpqxx use the PostgreSQL provided "libpq" to run the underlying database operations, so they should be equally reliable and safe to use. The major difference, of course, is that OZO is asynchronous, and I've identified libpqxx's synchronous behavior as a major bottleneck in my application.
My program is currently structured like this:
As you can see, I have some differences compared to the existing ozo examples.
So here's my wishlist of examples:
ozo::connection_info<>
,ozo::connection_provider<ozo::connection_info<>>
,ozo::connection_pool<ozo::connection_info<>>
. E.g., do I need to usestd::shared_ptr
? Or can I construct anozo::connection_pool<ozo::connection_info<>>
in main, and then pass it around by value? I would prefer not to pass it around by reference, so if I can't pass these by value, I would wrap them in astd::shared_ptr
.ozo::connection_pool
versus using anozo::connection
directly. Why would I not want to useozo::connection_pool
? If there's no reason why I would not want to, why exposeozo::connection
to the user at all?ozo::request
into a member variable of an object, and holding the lifetime of that object in the callback forozo::request
. -- I already know what I'm doing here. But there are probably lots of folks out there who wouldn't immediately see how to do it while reading the documentation.BEGIN
andEND
keywords inside an existing query, would help people understand the intended usage of OZO.LISTEN
/NOTIFY
functionality from PostgreSQL. To elaborate, in my application, I query the database for configuration information which is sent out on the network to other programs. I wish to listen to the database for changes to this configuration, and send re-configuration messages to the other applications over the network when that happens.std::bind
to call member functions of objects held bystd::shared_ptr
-- I already know how to do this. But there are probably lots of folks out there who wouldn't immediately see how to do it while reading the documentation."partone"_SQL + int64_t(5) + "parttwo"_SQL
;