LLNL / metall

Persistent memory allocator for data-centric analytics
Other
53 stars 13 forks source link

Contributions in logging and file handling #297

Closed bigerl closed 9 months ago

bigerl commented 1 year ago

We've done overhauls in the two areas logging and file handling, and would like to contribute them back:

Both are breaking changes because interfaces are affected and we had to raise the C++ standard to C++20.

Would you be interested to integrate them?

KIwabuchi commented 1 year ago

Hi @bigerl,

Thank you for the suggestions. I'm interested in them! I'll look into the code and get back to you.

Which feature in C++ 20 is used? I'm wondering which GCC and Clang versions are required.

liss-h commented 1 year ago

Hi there, Its mostly C++-17 except that we are using <format> for the logging, so very recent. But maybe you'd like some of the changes to file handling, they shouldn't rely on any recent C++-standard.

Having another pair of eyes look at that would probably help as well :)

judicaelclair commented 1 year ago

Hi all, given that this would require C++20, I'd strongly suggest using std::source_location instead of __PRETTY_FUNCTION__ so that the log sink has maximum flexibility in the output format. And more to do with personal preference, I would pass the formatted message as a std::string&& / std::string to metall_log instead of prematurely converting it to char const*. If metall is then used in a C environment, this would simply require a small wrapper to metall_log that converts the arguments to char const* as they are currently.

Overall, I'd change metall_log to void metall_log(metall_log_level lvl, std::source_location loc, std::string&& msg);

For maximum portability, if <source_location> is not available, you can then fallback to __PRETTY_FUNCTION__, and I'd also allow the user to decide on the formatting library. That is, instead of using <format>, it should be possible to use a user-provided fmt library (e.g. the latest fmt library or the one bundled with spdlog). Since the formatting is straightforward, this should simply require conditionally changing header inclusion and namespace via macros. spdlog does exactly this, e.g. see include/spdlog/fmt/fmt.h.

liss-h commented 1 year ago

Thank you for your input. Optionally using fmt and conditionally using source_location in the top-level function sound like good ideas to me, thx.

But I also wanted to give some context on the current signature of metall_log. The reason why it is a pure C function is that we use metall (among other things) in other languages (e.g. Rust) using a thin C wrapper.

For that purpose we want to be able to log metall messages directly into the "native" logging framework of the respective language. The way we do this is defining metall_log in the target language, which only really works in most languages if its pure C. Thats also why there is only a declaration of metall_log in metall itself by default (you can optionally link default_logger if you want the default implementation).

We are aware that this is a potentially very niche usecase, so not sure if that fits your way of doing things.

judicaelclair commented 1 year ago

To deal with the case when you need a C interface, you could do something like this:

#if defined(METALL_LOG_EXTERN_C)
#include "metall/logger_interface.h"

namespace metall {

static inline void do_log(const metall_log_level lvl,
                          const std::source_location loc, std::string&& msg) {
  ::metall_log(lvl,
               fmt::format("{}:{}:{} in `{}`", loc.file_name(), loc.line(),
                           loc.column(), loc.function_name()).c_str(),
               msg.c_str());
}

} // namespace metall
#else
namespace metall {

void do_log(const metall_log_level lvl, std::source_location loc,
            std::string &&msg); // user-implemented.

} // namespace metall
#endif

and define METALL_LOG_EXTERN_C. Metall itself would then always call metall::do_log instead of ::metall_log.

KIwabuchi commented 1 year ago

Hi everyone,

Thanks for the additional information and the new suggestion!

Compiler Version

Metall is sometimes used on systems that have only old compilers due to their system management policy, unfortunatedly. It looks like std::format requires GCC v13 and std::source_location requires v11. Let me do some investigation to find out the minimum compiler versions we need to support.

Logger

Optionally using fmt and conditionally using source_location in the top-level function sound like good ideas to me

I also like the ideas. I'm also hoping that we can come up with a way to provide a C interface at the same time:)

In addition, I'm wondering if we could move the default logger's definition into a header file to keep the 'header-only' library design. I want to make Metall work nicely with other languages. At the same time, I also want to make the default Metall implementation easy to use for users.

File handling

@Clueliss Using std::path would be a good idea! I'll look deeply into the other changes and get back to you.

liss-h commented 1 year ago

Important update: as it turns out the copy_file_sparse_linux(int, int) function is not correctly implemented right now, as it does not restore the holes in the file (which changes the offsets in the file).

So after a snapshot metall cannot read the datastore properly anymore. We will be reverting to cp --sparse=always to fix it now and will reimplement it correctly at a later point.

Sorry about that, will update you once the new version is implemented.

KIwabuchi commented 1 year ago

@Clueliss Sounds good! No program at all!

bigerl commented 9 months ago

Changes were integrated with #312 and #313. Closing this.