Closed damienhocking closed 3 years ago
Thanks. This is really useful.
Note it could be simplified a little by using `mylogger->set_pattern()` instead of creating formatters and passing them around.
Oh right. I was mucking around with custom formatters. I'll fix it.
Edited to make it simpler.
Thanks. Please put this in the wiki (as a new page, I think).
Done & closed.
@gabime I found this while trying to see if I can do structured logging with spdlog, and this looks like a nice and simple way to add JSON logging with fields that are statically known by the formatter (I can add custom formatter flags, for example).
But one thing I'm not sure how to achieve (or whether it's possible at all) is adding contextual data to each log line, outside of the message. When I do structured logging in other languages, I log a bunch of properties that should be their own fields in the JSON map: a correlation ID, or just other machine-readable data relevant to that log line. For example, in something like Python you might have `log.info("Some message: {}", 123, context={'some-data': 456})` or similar.
Is there some way to do this with spdlog? From looking through the documentation, it seems maybe not, at least not easily, but perhaps I overlooked something?
The best I can come up with is to simply follow some structure in the message string and parse it (either from the logs later, or in a custom formatter, depending on where I want to pay the price of parsing).
EDIT: It seems that structlog does what I want, but it achieves this by changing the spdlog API so that log calls return an object on which the extra data can be set. I worry that it will quickly fall behind spdlog's development, as merging with spdlog's changes seems hard.
@danielytics You are right. There is no straightforward way to achieve this, since the API doesn't provide a way to pass extra context values.

> The best I can come up with is to simply follow some structure in the message string and parse it

Another (ugly yet effective, I think) way would be using some shared context object, which would be updated when needed (e.g. before calling the log) and read by the formatter, without the need to parse strings.
We're dealing with this in two ways. Our first way is a message that's just a string, with all the other stuff in it, including numbers, tags, etc. Then you have a log that's JSON, but something somewhere else still has to parse that format. This is fragile, which is leading to the second way. We got the idea from our use of RavenDB, with their "@metadata" tags in their document view, and it maps pretty well to what we want to do with message interpretation. See the JSON doc in here:
https://ravendb.net/docs/article-page/4.0/commands/studio/database/documents/document-view
We're using a metadata tag to reference a type of message schema for different types of messages. Something else still has to do the work of interpreting the message, but at least it's structured and can follow a rational format. We're still stumbling through this and iterating as we go, but it's better than nothing. We might go with a full JSON-tagged message format, but I don't know yet. Some of our messages are for humans and some are for machines on the other side of the API.
> Another (ugly yet effective I think) way would be using some shared context object which would be updated when needed (e.g. before calling the log) and read by the formatter without the need to parse strings.

I thought about that, but it seemed a bit messy; certainly not great ergonomics if I need two function calls to log something. Maybe the best solution is to wrap spdlog and not call it directly at all. Something to think about. Thanks.
> but at least it's structured and can follow a rational format.

I guess an issue I have with encoding the data in the message with each log call (and having something else then interpret it) is that it's easy to make a mistake and accidentally log in a slightly different format. I suppose some discipline and code review can help, but it's not ideal. It's likely the best bet right now, though, without changing the spdlog code. Thanks for the RavenDB idea, I'll think on it.
> Some of our messages are for humans and some are for machines on the other side of the API.

Indeed. What I tend to do (in other languages) is always try to format my logs to be machine-readable, generally an "event" name plus a bunch of metadata, but to also always include a "text" or "message" key in the data for humans to read. It's somewhat redundant, but it means the logs are useful for humans too. I tend not to actually need it (I work mostly off the event names), but it's handy at a glance, and handy for others who may not be familiar with the event names yet.
In case anyone else is searching for this, here's what I'm using. I wrote a custom formatter helper that looks like this (requires C++17):

```cpp
#include <cstdint>
#include <initializer_list>
#include <iomanip>
#include <string>
#include <unordered_map>
#include <utility>
#include <variant>

namespace {
// Standard overload-set helper for std::visit.
template<class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
template<class... Ts> overloaded(Ts...) -> overloaded<Ts...>;
}

struct SimpleJSON
{
    using json_val = std::variant<std::int64_t, int, double, std::string, bool>;
    std::unordered_map<std::string, json_val> members;

    SimpleJSON(std::initializer_list<std::pair<const std::string, json_val>> il) : members{il} {}

    template<typename OStream>
    friend OStream &operator<<(OStream &os, const SimpleJSON &j)
    {
        for (const auto &kv : j.members) {
            os << ", " << std::quoted(kv.first) << ":";
            std::visit(overloaded {
                [&](std::int64_t arg) { os << arg; },
                [&](int arg) { os << arg; },
                [&](double arg) { os << arg; },
                [&](const std::string &arg) { os << std::quoted(arg); },
                [&](bool arg) { os << (arg ? "true" : "false"); }
            }, kv.second);
        }
        return os;
    }
};
```
Then the usage looks like this:
```cpp
using J = SimpleJSON;

spdlog::set_pattern(
    "{\"timestamp\":\"%Y-%m-%dT%H:%M:%S.%e%z\",\"logger\":\"%n\",\"log_"
    "level\":\"%l\",\"process_id\":%P,\"thread_id\":%t %v}");

spdlog::info("{}", J({{"key1","value1"},{"key2",true},{"key3",99}}));
```
Which results in single lines that look like:
```
{"timestamp":"2022-01-14T09:00:00Z","logger":"1","log_level":"info","process_id":123,"thread_id":123, "key1":"value1","key2":true,"key3":99}
```
It doesn't handle nested JSON, but with a bit more template magic, it probably could. My needs are just for various key/value pairs in JSON format, so this works for me. I'm happy about the pretty compact syntax on the actual log lines.
Note you can now also define file event callbacks to add the opening/closing braces at the start/end of the log file.
Nice one, that's really neat.
@danpat
That looked very useful to me, so I decided to try it, but I'm having trouble getting it to build. I'm on Ubuntu with GCC 9.3.0; what did you build with/on?

It's this line:

```cpp
spdlog::info("{}", J({{"key1","value1"},{"key2":true},{"key3":99}}));
```

It's complaining about the parenthesis after `J`, and the subsequent colons.

I should give it a try with a newer compiler, but before getting around to setting that up I thought I'd check. Thank you!
@IliasBergstrom Typo in my example: replace the `:` with `,`. I've updated the text I posted.
@danpat
Thanks, now it builds and runs! Weirdly, however, it prints `"key3": 99, "key2": true, "key1": true` (it treats the string as a bool).
I tried a different solution which may be worth sharing here, and which allows complex JSON in the payload:
First:

```cpp
spdlog::set_pattern("{\n \"log\": [");

// ...

// This initializes the log file with the opening brace and the "log" array as above.
_logger_instance->info("");

// We have some extra formatting on the log level %l below,
// to keep color coding when dumping json to the console,
// and we use a full ISO 8601 time/date format.
spdlog::set_pattern(
    "{\"time\": \"%Y-%m-%dT%H:%M:%S.%e%z\", "
    "\"name\": \"%n\", "
    "\"level\": \"%^%l%$\", "
    "\"process\": %P, "
    "\"thread\": %t, "
    "\"data\": %v}, ");

_logger_instance->info("{\"Status\":\"Started\"}");
```
(...)

Then you can log as follows, here using rapidjson to create a string from a JSON structure (this is elsewhere in the code):

```cpp
std::string my_string_1("my_string_1");
const std::string my_string_2("my_string_2");

rapidjson::Document json_doc;
auto allocator = json_doc.GetAllocator();
json_doc.SetObject();
json_doc.AddMember("key_one", rapidjson::StringRef(my_string_1.c_str()), allocator);
json_doc.AddMember("key_two", rapidjson::StringRef(my_string_2.c_str()), allocator);
json_doc.AddMember("key_three", 2, allocator);

rapidjson::StringBuffer buffer;
buffer.Clear();
rapidjson::Writer<rapidjson::StringBuffer> writer(buffer);
json_doc.Accept(writer);

std::string as_string = buffer.GetString();
json_logger.info(as_string.c_str());
```
(...)

And then, for the JSON to be valid, this is the last entry:

```cpp
spdlog::set_pattern(
    "{\"time\": \"%Y-%m-%dT%H:%M:%S.%e%z\", "
    "\"name\": \"%n\", "
    "\"level\": \"%^%l%$\", "
    "\"process\": %P, "
    "\"thread\": %t, "
    "\"data\": %v} \n] \n}");

_logger_instance->info("{\"Status\":\"Finished\"}");
_logger_instance->flush();
```
This is not at all optimized, and the JSON string example is written in a brute-force manner, but you may find ways to make it work for you; it will work with any valid JSON string, after all.
The result is this (but with different line breaks):

```json
{
    "log": [
        {
            "time": "2022-01-19T10:16:41.598+01:00",
            "name": "json_logger_0",
            "level": "info",
            "process": 50662,
            "thread": 50662,
            "data": {
                "Status": "Started"
            }
        },
        {
            "time": "2022-01-19T10:16:41.598+01:00",
            "name": "json_logger_0",
            "level": "info",
            "process": 50662,
            "thread": 50662,
            "data": {
                "key_one": "my_string_1",
                "key_two": "my_string_2",
                "key_three": 2
            }
        },
        {
            "time": "2022-01-19T10:16:46.350+01:00",
            "name": "json_logger_0",
            "level": "info",
            "process": 50662,
            "thread": 50662,
            "data": {
                "Status": "Finished"
            }
        }
    ]
}
```
@IliasBergstrom You probably need to add `const char *` as a type in the `std::variant` and in the `std::visit` call in my code. The compiler will try to fit the values you pass into the best-matching field type in the `std::variant`, and it's deciding to use `bool` when you pass in a fixed string. You could also try casting your string to `std::string("blah")`, but that kind of defeats the purpose of the compact syntax.
Thanks for sharing this. There is a small issue if you try to log, in the `message` field, a string that contains a character like `\`. I encountered this when I was logging file paths on Windows, which contain the backslash character; it fails, of course, because the backslash needs to be escaped for the JSON to decode. Does anyone have a workaround for this?
@LordNoteworthy The line that does this:

```cpp
[&](const std::string &arg) { os << std::quoted(arg); },
```

is only escaping `"` characters; it should be extended to also escape `\`. I think you can do this with:

```cpp
[&](const std::string &arg) { os << std::quoted(std::quoted(arg, "\\")); },
```

The `std::quoted()` function can be used to replace `\` with `\\` as well as `"` with `\"`. But I haven't tried it! Let me know if it works, and I can update what I wrote above.
@danpat I was initially using the code shared by @damienhocking. Now that I've also used your custom formatter, it actually works just by using `std::quoted` once! Thanks heaps.
@damienhocking You probably need to update your post to do a JSON escape for the `message` field; otherwise it won't be valid JSON when the msg contains JSON special chars.
@LordNoteworthy Ah, you're right, I should've read the spec for `std::quoted()` more closely:

> Then every character from s, except if the next character to output equals delim or equals escape (as determined by the stream's traits_type::eq), then first appends an extra copy of escape

So it will escape both `"` and `\` in the input string by default. TIL. Very convenient.
With the custom formatter, a large fraction of my post is kinda out of date :-)
I'm thinking this needs a complete update and a replacement of the wiki article with this new code, plus the file event callbacks.
@damienhocking Please do. JSON formatting with spdlog seems more popular lately.
I'll see how much time I can get over the Christmas break. Maybe this can be collaborative and a few of us can do a proper design. I don't know if that could be extended to the popular json libs like Lohmann, Boost or RapidJSON.
> I don't know if that could be extended to the popular json libs like Lohmann, Boost or RapidJSON.

Of course not. It should be very basic and very minimal.
> In case anyone else is searching for this, here's what I'm using. [...]
In newer versions, make sure to include:

```cpp
#include "spdlog/fmt/ostr.h"
```

otherwise there'll be fmt compilation errors due to requiring a formatter.
@jeffreytai I'm still getting fmt compilation errors due to requiring a formatter.
My solution is to just define a basic JSON template, using a common JSON library, that contains the spdlog format specifiers:

```cpp
nlohmann::json _log_template = {
    {"timestamp", "%Y-%m-%dT%H:%M:%S.%e%z"},
    {"logger", "%n"},
    {"log_level", "%l"},
    {"process_id", "%P"},
    {"thread_id", "%t"},
    {"message", ""}
};
```
Then accumulate key/value pairs into another JSON object for each log record and patch it with the log template:

```cpp
// Combine the log message with the template
_json_log.merge_patch(*_log_template_ptr);
```
Now I can just set the pattern to this and print nothing (`_log_ptr` being a pointer to the spdlog instance):

```cpp
std::string pattern = _json_log.dump();
_log_ptr->set_pattern(std::move(pattern));
_log_ptr->info("");
```
I haven't seen any JSON examples anywhere with spdlog, so I thought I'd share how we did it. It's not hard to have both human-readable and JSON-based machine-readable logs. I've edited this as per @gabime's response below: you don't need to use custom formatters, you can just call `set_pattern(...)` directly as below.
Then, log whatever you like as normal, for example:
This will give you a log entry structured like this, although it will all be on one line:
You have to make sure yourself that whatever you put in your log messages is valid JSON; you can make the messages as complex as you need, with a complete JSON object if necessary. Most C++ JSON libraries can dump out a std::string that can be parsed as JSON and then passed as an argument to spdlog, or you can just do plain-text messages like this example.
When you're finished logging, you have to close out the "log" array. We also drop the log to clean it up ourselves:
You end up with a log file that looks something like this (our setup and drop are done on a different thread from the actual work, hence the different thread IDs):
You have to watch what minimum log level you have enabled in your compile; if you've disabled "info", then you'll need to use a higher severity to set up the initial JSON and close it out at the end. We're also going to develop a custom sink based on the new Boost.JSON so we can have a fast, in-memory JSON log as well, for real-time processing.