tstack / lnav

Log file navigator
http://lnav.org
BSD 2-Clause "Simplified" License

Ability to force log format? #454

Open einnjo opened 7 years ago

einnjo commented 7 years ago

I'm having trouble getting lnav into the correct log format.

I made a custom json log format and it works great when all the lines in the log are json.

Problem is that I'm piping log events from an external source (CloudWatch logs) and sometimes a line is malformed which breaks log format detection.

tstack commented 7 years ago

Do you have an example of a malformed line? We should probably try to ignore it instead of having it break detection.

einnjo commented 7 years ago

I'll copy it if it happens again. But basically it was an incomplete json string.

E.g.

evel": 30, "tags": ["api", "prod", "absh123"]}

Also, sometimes an uncaught error skips the app's json logger and writes a stack trace to the output as is; this also breaks the log format.

einnjo commented 7 years ago

Here's another example: the first few lines of a log stream will also contain app initialization output. [image]

cattz commented 7 years ago

I'm having issues trying to view log4j (Atlassian) log files. I tried defining my own format with examples from the real logs, but it's still failing. Also, the -d option mentioned in the last section here doesn't seem to exist anymore. lnav-0.8.2

tstack commented 6 years ago

@cattz The '-d' option should still work; it hasn't been removed. Can you give some sample log files? I can try to help define the format.

dset0x commented 5 years ago

While being able to skip malformed lines would likely solve this specific problem, being able to force a specific format is useful, e.g. if the format changes mid-file and the user only cares about the newest format.

tstack commented 5 years ago

if the format changes mid-file and the user only cares about the newest format.

The latest version (v0.8.5) now supports multiple regex patterns per file. In other words, if a format has multiple regex patterns, lnav should match one pattern and then switch to another when the message format changes within the file.

dset0x commented 5 years ago

@tstack That's great. I'll have to try it out.

I appreciate you sticking to the intended capabilities and features.

itrop commented 4 years ago

It would be nice to have a command line parameter forcing the desired log format. In my case I have lnav version 0.8.1. The default format for the log I'd like to check is "generic_log". I can't see any configuration for it in the .lnav/formats/default/default-formats.json.sample file. Is there any "workaround" for it?

macintacos commented 4 years ago

I would also like the ability to force a specific log that I'm looking at to be "generic" somehow; is there any way to do that? Can't figure it out from the documentation.

tstack commented 4 years ago

I would also like the ability to force a specific log that I'm looking at to be "generic" somehow; is there any way to do that? Can't figure it out from the documentation.

@macintacos Sorry, I'm not clear on what you mean here. Is the file you're looking at detected as another format or plain text? Can you provide a sample of the lines from the file you're interested in?

macintacos commented 4 years ago

@tstack what I think past-me meant was: I'm looking at a log file with a certain parser applied (let's say it's a user-defined one) and I'd like to switch to a "generic" log parser while I'm looking at that particular log file. This might be because I got a portion of my parser wrong and now some log lines I'm looking at don't look quite right, so I'd just like to default to the generic view for a time. As far as I know, there's no way to "force" lnav to parse a file with a different parser while you're in an lnav session.

I know about lo-fi mode, but that's not quite what I want. I want all the same functionality of lnav, just with the ability to switch to a parser on-demand (if possible).

tstack commented 4 years ago

@macintacos Thanks for getting back

I'm looking at a log file with a certain parser applied (lets say it's a user defined one) and I'd like to switch to use a "generic" log parser while I'm looking at a particular log file. This might be because maybe I got a portion of my parser wrong and now some log lines I'm looking at don't look quite right, so I'd just like to default to the generic view for a time.

Previous versions required a single pattern in the log format to match all messages, but that has now been fixed and the other patterns in the log format will now be tried. That should improve things a bit. As for falling back to the generic_log, I'm not sure how that would work. I would kinda hope the format would be fixed to match all the lines. It might be better to make it easier to get the format fixed up rather than trying to fall back to the generic log.

pkoziol commented 2 years ago

My feedback on this: yeah, multiple patterns are nice, but sometimes I don't feel like I can write meaningful patterns.

For example: Wildfly likes to log all system properties at the start, which of course takes multiple lines:

2022-01-13 16:45:15,317 DEBUG [org.jboss.as.config] (MSC service thread 1-7) Configured system properties:
    QUARTZ_THREAD_COUNT_CPU_MEMORY = 2
    QUARTZ_THREAD_COUNT_NETWORK_IO = 16
    JWT_MAXFUTUREVALIDITYINMINUTES = 120
    [Standalone] = 
    awt.toolkit = sun.awt.X11.XToolkit
    excludeAccessLogging = /rest/messages.*
    file.encoding = utf-8
    file.encoding.pkg = sun.io
    file.separator = /

or some EJB modules:

2022-01-13 16:45:30,724 INFO  [] [] [org.jboss.as.ejb3.deployment] (MSC service thread 1-1) WFLYEJB0473: JNDI bindings for session bean named 'UserRestWebServiceImpl' in deployment unit 'subdeployment "com.company.webservice.api.server-1.0.0-SNAPSHOT.jar" of deployment "App.ear"' are as follows:

    java:global/App/com.company.webservice.api.server-1.0.0-SNAPSHOT/UserRestWebServiceImpl!com.company.webservice.api.users.UserRestWebService
    java:app/com.company.webservice.api.server-1.0.0-SNAPSHOT/UserRestWebServiceImpl!com.company.webservice.api.users.UserRestWebService
    java:module/UserRestWebServiceImpl!com.company.webservice.api.users.UserRestWebService
    java:global/App/com.company.webservice.api.server-1.0.0-SNAPSHOT/UserRestWebServiceImpl
    java:app/com.company.webservice.api.server-1.0.0-SNAPSHOT/UserRestWebServiceImpl
    java:module/UserRestWebServiceImpl

This probably wouldn't be a problem if logs didn't start with these lines that don't match any pattern. And these messages are at the beginning of the file, because the server logs them when it starts...

malnoxon commented 2 years ago

I have a case where either forcing the log format or falling back to generic_log would be useful and which, as far as I can tell, can't be covered by the multiple regex support.

My logs are mostly json and I have a working config for those, but some of the lines are non-json and mostly appear towards the top of my log files for application startup type stuff. Since the config I have is for json logs, I can't use multiple regexes to address it: per https://docs.lnav.org/en/latest/formats.html, formats for json logs shouldn't specify a regex.

Forcing my format would work for my use case, as I very rarely care about the startup messages. The ideal state would be for those lines to fall back to a generic format while my config is used for the json lines.

tstack commented 2 years ago

My logs are mostly json and I have a working config for those but some of the lines are non-json and mostly appear towards the top of my log files for application startup type stuff. Since the config I have is for json logs, I can't use multiple regexes to address it as per https://docs.lnav.org/en/latest/formats.html files with json logs shouldn't specify it.

Do the non-JSON lines have timestamps? Seems like we could allow patterns to be specified in a JSON format to match the non-JSON lines.

malnoxon commented 2 years ago

My logs are mostly json and I have a working config for those but some of the lines are non-json and mostly appear towards the top of my log files for application startup type stuff. Since the config I have is for json logs, I can't use multiple regexes to address it as per https://docs.lnav.org/en/latest/formats.html files with json logs shouldn't specify it.

Do the non-JSON lines have timestamps? Seems like we could allow patterns to be specified in a JSON format to match the non-JSON lines.

Most of them do but not all of them unfortunately

(My current solution to this is to pad my logs with ~50 valid json log lines at the top of the file before starting lnav. This successfully gets lnav to treat all lines as json, but it does mean I have extra lines, and the non-json lines are hard to pick out among the json parsing errors.)
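A different pre-processing workaround, sketched here as an assumption rather than anything lnav provides, is to drop the non-JSON lines before lnav ever sees them. It only suits cases where the startup output and stack traces can be discarded. The function names `is_json_object` and `keep_json_lines` are hypothetical.

```python
import json
from typing import Iterable, Iterator


def is_json_object(line: str) -> bool:
    """Return True if the line parses as a single, complete JSON object."""
    try:
        return isinstance(json.loads(line), dict)
    except json.JSONDecodeError:
        # Truncated JSON, stack traces, blank lines, startup banners, etc.
        return False


def keep_json_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield only the lines that are complete JSON objects, unchanged."""
    for line in lines:
        if is_json_object(line.strip()):
            yield line
```

Wired to standard input and output (for example, writing each line from `keep_json_lines(sys.stdin)` to `sys.stdout`), a filter like this could sit between the log source and lnav in a pipe. The trade-off is that the dropped lines are gone entirely, so it is no substitute for a real fallback format.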

tektrip-biggles commented 2 years ago

I'd also like to request the ability to change/override the format of a particular log file manually. I've added my custom regex and it was working pretty well initially, but if for some reason it doesn't detect the format correctly (e.g. that day's logs have a larger proportion of "poorly formatted" lines without timestamps or something), then I'd like to just quickly override the auto detection with a single command (or via a flag param) and keep working rather than have to stop to debug my regex and/or google why the auto-detection didn't work and so forth...

It's great to have automation and auto-detection when it saves time, but if that doesn't work for any reason, it feels like there should always be a manual override if possible...

tstack commented 2 years ago

I'd also like to request the ability to change/override the format of a particular log file manually. [...] I'd like to just quickly override the auto detection with a single command (or via a flag param) and keep working

I'm sorry, but I'm not getting what you're asking for here. If the regex doesn't match, lnav won't be able to understand the log message. It won't be able to extract the timestamp, log level, etc... in order to do indexing and what not. What are you expecting to happen when doing this override?

debug my regex

Do you make use of the sample feature of the log format definition? If the regexes in the format do not match a sample there will be some detailed error messages printed out on startup. I suppose one improvement would be to print out a link to something like regex101.com to help streamline the debugging process.

tektrip-biggles commented 2 years ago

What are you expecting to happen when doing this override?

I'd expect it not to extract any information from non-matching lines of course, but there are cases where the log file may not have any matching lines yet (e.g. if there is a lot of startup output without timestamps etc) but I know that once the log gets going properly it will match. Furthermore, I'd expect to be able to force lnav not to load a particular file in "text" mode when I want to set it to view multiple log files "intertwined".

Since the log format is detected at startup (I think?), you have to restart once there are matching lines etc which could be a few minutes into your workflow, even though you know right from the start that the file is always going to end up with the majority of lines being in a particular format eventually.

To illustrate with the case I had today: I had been matching part of the line as [123] without realising that e.g. [ 1] was also possible in UE4's log output (i.e. I had used \d+ where [0-9 ]+ was needed), especially first thing after booting up the editor. That's a super trivial regex blunder and I did end up fixing it after realising the issue, but I would have preferred not to spend 30 mins or so distracted from more important tasks by such things; I'd rather have just overridden the autodetection and left the "bad" lines unformatted.
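The difference between the two patterns is easy to check in isolation. The snippet below is a minimal sketch using Python's `re` module; the bracketed samples are simplified stand-ins for the real UE4 log prefix, and the helper `matches` is hypothetical.

```python
import re

# Digits only: rejects a space-padded counter.
STRICT = re.compile(r"\[\d+\]")
# Digits or spaces: accepts a space-padded counter.
PADDED = re.compile(r"\[[0-9 ]+\]")


def matches(pattern: re.Pattern, text: str) -> bool:
    """Return True if the whole text matches the pattern."""
    return pattern.fullmatch(text) is not None


for sample in ["[123]", "[ 1]"]:
    print(sample, matches(STRICT, sample), matches(PADDED, sample))
```

Only the padded pattern accepts both samples, which is why a log that opens with space-padded counters would fail detection under the stricter regex.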

(Apologies if my tone comes across as being demanding, I really appreciate the work done on lnav! Even though I've only been using it for a week or so, I find it really useful. Hope this kind of feedback is helpful in understanding why such a "log format override" feature would be preferable in certain situations.)

piotr-dobrogost commented 2 years ago

Just thinking out loud. For the purpose of proper placement of such non-matched log lines among matched ones, they could get the timestamp of the last seen matched line, or some special timestamp START_OF_UNIVERSE if there were no matched lines before (the case where the log starts with non-matched lines).

tstack commented 2 years ago

Since the log format is detected at startup (I think?), you have to restart once there are matching lines etc which could be a few minutes into your workflow, even though you know right from the start that the file is always going to end up with the majority of lines being in a particular format eventually.

The format is detected as lines are read from the file. As new lines are added, they will be tested against the formats until a match is found or the limit (15,000 lines in recent versions) is reached. So, you shouldn't have to do a restart. I tried this out and it seems to work, so I'm not sure what is going on in your case. (There was a bit of a bug where lnav was not automatically switching from the TEXT view to the LOG view when the format was detected. I've fixed that in the top-of-tree, but you still could press q to pop the TEXT view and go back to the LOG view manually.)

Here's a capture where I'm appending some text to a file and then when I append a log message it switches to the LOG view:

[Image: screen capture, 2022-04-04 23 23 59]

If your log is not being recognized until after a restart, then that seems like a bug. If you can replicate it, you can capture a debug log (pass -d /path/to/log on the command line) and I might be able to figure out what is happening.
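As a rough illustration of the detection behaviour described above (not lnav's actual implementation), the sketch below tests each incoming line against a set of candidate formats and gives up after a line limit. The function name `detect_format` and the single-regex-per-format layout are assumptions; the default limit is the 15,000 figure mentioned above.

```python
import re
from typing import Dict, Iterable, Optional


def detect_format(
    lines: Iterable[str],
    formats: Dict[str, re.Pattern],
    limit: int = 15000,
) -> Optional[str]:
    """Return the name of the first format that matches any of the first
    `limit` lines, or None if nothing matches within the limit."""
    for index, line in enumerate(lines):
        if index >= limit:
            break
        for name, pattern in formats.items():
            if pattern.match(line):
                return name
    return None
```

Under this model a file that starts with unrecognized text is still picked up once a matching line arrives, as long as that happens before the limit is reached.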

Furthermore, I'd expect to be able to force lnav not to load a particular file in "text" mode when I want to set it to view multiple log files "intertwined".

I'm not sure how this intertwining would work without the lines having timestamps. If you mean having lines mixed together as they are read from the file, I'm not fond of that since it's not reproducible when you exit and restart since the lines would no longer be mixed.

For the purpose of proper placement of such non-matched log lines among matched ones they could get the timestamp of the last seen matched line

Yes, that is how things work already. Unrecognized lines are treated as a continuation of the previously recognized line.

or some special timestamp START_OF_UNIVERSE if the there were no matched lines before (the case where the log starts with non-matched lines).

Lines without timestamps at the start of a log are treated as messages with timestamps that are the same as the first recognized line.
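The two rules above can be sketched as follows. This is an illustration of the described behaviour only, not lnav's code; the timestamp regex and the function name `assign_timestamps` are assumptions chosen to fit the Wildfly samples earlier in the thread.

```python
import re
from typing import List, Optional, Tuple

# Assumed timestamp shape, e.g. "2022-01-13 16:45:15,317".
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) ")


def assign_timestamps(lines: List[str]) -> List[Tuple[Optional[str], str]]:
    """Pair every line with a timestamp.

    Unrecognized lines inherit the timestamp of the previous recognized
    line; unrecognized lines before the first recognized line borrow the
    timestamp of that first recognized line.
    """
    paired: List[Tuple[Optional[str], str]] = []
    last: Optional[str] = None
    for line in lines:
        found = TIMESTAMP.match(line)
        if found:
            last = found.group(1)
        paired.append((last, line))
    # Leading lines still have no timestamp; give them the first one seen.
    first = next((ts for ts, _ in paired if ts is not None), None)
    return [(ts if ts is not None else first, line) for ts, line in paired]
```

If no line in the input is recognized, every entry keeps `None` as its timestamp, which corresponds to the case where the file is never detected as a log at all.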

tstack commented 2 years ago

I suppose one improvement would be to print out a link to something like regex101.com to help streamline the debugging process.

I have pushed an integration with regex101.com to the top of the tree and made a small writeup here:

https://lnav.org/2022/05/01/regex101-integration.html

I realize this isn't quite what y'all are asking for, but I'm hoping it can improve things anyhow.