Open nisanharamati opened 6 years ago
cc @rachelreese
I second this general idea, with the caveat that I too don't understand the performance implications. In any case, a more structured logging story would help immensely with operations. In addition to node-local time, the given worker's OS PID would be nice to know as well.
The number of calls for output is relatively low. Adding a primitive that formats the methods and adds a timestamp seems reasonable.
Other option would be to start using a logger like the one from the pony standard library (that I wrote) which is an actor (same as env.out, env.err) which gets sent messages that are asynchronously logged.
For that to be effective, if it had log levels, we would want it to operate in a fashion similar to how logger works in order to avoid extra messages sends (see logger documentation).
For now, starting with a primitive that wraps @printf
and does some formatting seems very reasonable.
Here's what I've done in a branch where I was doing tests and needed timestamps in the logs:
diff --git a/lib/wallaroo_labs/printts/printts.pony b/lib/wallaroo_labs/printts/printts.pony
new file mode 100644
index 00000000..6b74382b
--- /dev/null
+++ b/lib/wallaroo_labs/printts/printts.pony
@@ -0,0 +1,28 @@
+/*
+
+Copyright 2017 The Wallaroo Authors.
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied. See the License for the specific language governing
+ permissions and limitations under the License.
+
+*/
+
+use "time"
+
+primitive Printts
+ fun apply(msg: String) =>
+ // get timestamp
+ (let s, let ns) = Time.now()
+ let us = ns / 1000
+ let ts = PosixDate(s, ns).format("%Y-%m-%d %H:%M:%S." + us.string())
+ // print it with @printf[I32]
+ @printf[I32]("%s %s\n".cstring(), ts.cstring(), msg.cstring())
Then in each code file where there is a @printf[I32](...)
, replace it with:
use "wallaroo_labs/printts"
...
Printts("my log text")
Note that while @printf[I32]
takes a cstring
, Printts
takes a regular String
, so you do need to change the arguments a little.
A temporary hack that's useful when debugging is to swap printf
using LD_PRELOAD
. It works on linux, and I have no idea if works on Mac. It probably does not work on Windows.
Create the following c program, named tsprintf.c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stdio.h>
#include <dlfcn.h>
#include <time.h>
#include <stdarg.h>
#include <stdlib.h>
static int (*orig_printf)(const char *format, ...) = NULL;
int printf(const char *format, ...)
{
if (orig_printf == NULL)
{
orig_printf = (int (*)(const char *format, ...))dlsym(RTLD_NEXT, "printf");
}
time_t timer;
char buffer[26];
struct tm* tm_info;
time(&timer);
tm_info = localtime(&timer);
strftime(buffer, 26, "%Y-%m-%d %H:%M:%S :: ", tm_info);
orig_printf("%s", buffer);
int result;
va_list args;
va_start(args, format);
result = vprintf(format, args);
va_end(args);
return result;
}
compile it into a shared lib with
gcc -shared tsprintf.c -o tsprintf.so -ldl -fPIC
Then run your application with LD_PRELOAD
:
LD_PRELOAD="./tsprintf.so" ./my_wallaroo_app
@nisanharamati that's too clever for my blood.
Catching up on old threads. I'm not sure what Nisan's strategy is today for timestamps, sorry. For completeness, this below is the OS X/macOS way of doing the same:
DYLD_INSERT_LIBRARIES=./tsprintf.so DYLD_FORCE_FLAT_NAMESPACE=1 ./my_wallaroo_app
@slfritchie I think my preferred approach is https://github.com/WallarooLabs/wallaroo/issues/2307#issuecomment-410285279
except maybe include the use "wallaroo_labs/printts"
in lib/wallaroo/core
so it's available by default?
It doesn't address the issue where @printf
is variable arity, but I leave that detail to the implementer!
It can be very difficult to piece together a reasonably estimate of a history of events from the current workers logs because none of them have timestamps.
So for that end, I think we should add timestamps to the logs.
This has some cons:
@printf
like we have been until now. We need to either include a timestamp in every@printf
call (laborious and error prone) or create a function that does it for us. Maybe we could make that function a C-FFI call like@printfts
that we can replace in-place without modifying any other code or requiring every module touse "custom_printf"
or some variation of that.The pros, however, I think are worth it: