Closed: fjetter closed this 1 year ago
FWIW, I'm using the regex below to process my logs (don't judge me, it's awful, I'm not a regex person):
Search: `(\d{4}-\d{2}-\d{2}T((\d{2}:?)+))\s\w+\s+\d+\s(\d{2}:?)+\s([\w\-\d]+)\s([\w\-\d\[\]]+):\s(\d{4}-\d{2}-\d{2}\s(\d{2}:?)+,\d+\s-\s)?`
Replace: `$2 $5\t-`
This gives me a somewhat more concise log:
Time IP - distributed_msg
22:17:09 ip-10-0-5-46 - distributed.scheduler - INFO - Remove worker <WorkerState 'tls://10.0.13.147:43965', name: test_dataframe-86e8569e-worker-7862a1560f, status: closing, memory: 0, processing: 0>
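The same Search/Replace pair can be run with Python's `re.sub` (in which `$2 $5` becomes `\2 \5`). The sample input line below is an assumption, reconstructed from the prefix format described in this issue (journal timestamp, syslog date, instance name, process name, then dask's own timestamp); the regex itself is unchanged:

```python
import re

# The editor-style Search pattern from above, split for readability
pattern = re.compile(
    r"(\d{4}-\d{2}-\d{2}T((\d{2}:?)+))\s\w+\s+\d+\s(\d{2}:?)+\s"
    r"([\w\-\d]+)\s([\w\-\d\[\]]+):\s"
    r"(\d{4}-\d{2}-\d{2}\s(\d{2}:?)+,\d+\s-\s)?"
)

# Assumed raw line; the real prefix may differ in detail
raw = ("2022-06-02T22:17:09 Jun  2 22:17:09 ip-10-0-5-46 cloud-init[1234]: "
       "2022-06-02 22:17:09,123 - distributed.scheduler - INFO - Remove worker")

# "$2 $5\t-" in editor syntax is r"\2 \5\t-" for re.sub:
# keep only the time (group 2) and the instance name (group 5)
short = pattern.sub(r"\2 \5\t-", raw)
print(short)
# -> "22:17:09 ip-10-0-5-46\t-distributed.scheduler - INFO - Remove worker"
```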
@ntabris made the logs look nicer, so I'm assuming you're happy, @fjetter; reopen if not.
The currently configured log format is incredibly difficult to read.
This is an example of a log message as retrieved using the coiled cluster logs.
The prefix is something like:
`TS instance_name Date+TS process_name(??): <dask.distributed_log_format>`
There are 190 characters before the actual message starts. Some of this is surely valuable information, but it should not take 190 chars. If the `--short` flag is provided, this drops to 123 chars for the same log message. The difference between short and long is only the leading timestamp formatting and the missing instance name.
We receive timestamps three times in different formats. It looks like a log message is emitted and formatted, ingested by another logger, formatted again, and so on: three times through three different log formats.
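The nesting suspected above can be reproduced with stdlib logging alone: a record is formatted once, the already-formatted string is re-logged through an outer logger with its own timestamped format, and a journald/syslog layer would prepend a third. The logger names here ("journal") are illustrative, not Coiled's actual components:

```python
import io
import logging
import re

# Outer collector with its own timestamped format
buf = io.StringIO()
handler = logging.StreamHandler(buf)
handler.setFormatter(logging.Formatter("%(asctime)s %(name)s: %(message)s"))
outer = logging.getLogger("journal")
outer.addHandler(handler)
outer.setLevel(logging.INFO)
outer.propagate = False

# Inner layer: dask-style format applied to a record first
inner = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
record = logging.LogRecord("distributed.scheduler", logging.INFO,
                           __file__, 0, "Remove worker", None, None)

# The inner-formatted string becomes the outer message and is formatted again
outer.info(inner.format(record))

line = buf.getvalue().strip()
# Two layers of formatting -> two full timestamps in one line
n_timestamps = len(re.findall(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}", line))
```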
Typically, when reading logs, I write them to a file and semi-manually remove the leading formatter noise before I can work with them.
As a Coiled user, I would like the ability to specify a log format, and a much less verbose default.
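For illustration, here is a minimal sketch of what a "much less verbose default" could look like on the stdlib side (time-only timestamp, logger name, level, message). This is plain `logging`, not Coiled's API; the `StringIO` plumbing and the `.demo` logger name exist only to show the resulting line:

```python
import io
import logging

buf = io.StringIO()
handler = logging.StreamHandler(buf)
# Terse format: HH:MM:SS instead of three full date+time prefixes
handler.setFormatter(logging.Formatter(
    "%(asctime)s %(name)s - %(levelname)s - %(message)s", datefmt="%H:%M:%S"))

log = logging.getLogger("distributed.scheduler.demo")
log.addHandler(handler)
log.setLevel(logging.INFO)
log.propagate = False

log.info("Remove worker")
line = buf.getvalue().strip()
# e.g. "22:17:09 distributed.scheduler.demo - INFO - Remove worker"
```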