bensadeh / tailspin

🌀 A log file highlighter
MIT License

Custom regex patterns? #105

Closed bensadeh closed 9 months ago

bensadeh commented 9 months ago

Discussed in https://github.com/bensadeh/tailspin/discussions/102

Originally posted by **seqizz** December 29, 2023

Hi :wave: First of all, cool project. I tried it in a few places and I'm liking it. A small question: do you plan to support custom patterns? Something like:

```
[our_special_fqdn]
segment = { fg = "cyan", italic = true }
separator = { fg = "red" }
regexp = ' [a-z0-9-]+.[a-z0-9]+\.business\.local '
```

Here is an example I see on the latest release:

![image](https://github.com/bensadeh/tailspin/assets/307899/7e5a1724-5f25-45da-b388-a1ce5247e125)

As far as I understand, this is the "process" highlight, but it can't catch `udev-worker` because of the parentheses. Considering the weird logs we see every day, I thought this might be a nice addition for those who'd like to lose some sanity via regex and gain amazing customization powers.

bensadeh commented 9 months ago

Hi @seqizz and thank you for starting this discussion. Glad you think this project is cool. Let's break down the question into two parts:

(udev-worker)[32410] is not properly highlighted

I think we can classify this as a bug and add support for parentheses so that this entry is highlighted properly. I don't think this would interfere with other highlighters, so it should be safe to push a fix for this specific example.
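
As a rough illustration of the kind of fix described here, the Python sketch below matches a process name followed by a PID in square brackets, and optionally allows the name to be wrapped in parentheses. The pattern and the names `PROCESS_RE` and `find_process` are assumptions made for illustration; they are not taken from tailspin's source.

```python
import re

# Hypothetical pattern (not tailspin's): a process name, optionally wrapped
# in parentheses, followed by a numeric PID in square brackets.
PROCESS_RE = re.compile(r"(\(?[\w.-]+\)?)\[(\d+)\]")


def find_process(line):
    """Return (name, pid) for the first process-like token, or None."""
    m = PROCESS_RE.search(line)
    return (m.group(1), m.group(2)) if m else None


print(find_process("(udev-worker)[32410]"))
print(find_process("Jan 02 21:14:54 innodellix systemd[1]: Started Accounts Service."))
```

Making the parentheses optional keeps plain names such as `systemd[1]` matching as before, which is why a change of this kind is unlikely to disturb other entries.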

Add support for custom regex patterns

This one is a larger topic. Although I have never written this down explicitly, the philosophy behind tailspin is in some ways to be anti-regex. The idea was this: we humans know what a date looks like (or a number, or a keyword, etc.), so let's have a program do the heavy lifting and find those entries for us. As a user, I would have the power to customize the highlighting, and the regex would never be exposed.

So far this philosophy has been working, and we have managed to identify many different log entries that (as far as I am aware) do not interfere with each other.

Having said that, I am not against adding regexp support to tailspin at some point in the future, but opening up to custom regexps is not the focus in the short to medium term.

What use case would you use a custom regexp for? Highlighting domains containing business.local or something else? Maybe there is another highlight group that should be added for your use case?

seqizz commented 9 months ago

Thanks for the detailed reply.

I understand the "it just works" idea, which is neat.

I was suggesting it to make tailspin extensible for all use cases, by recognizing and applying any regex inside custom keywords (without touching the current clever-catching and without exposing any regexp by default).

E.g. someone can say "oh I'd like to see this specific whatever highlighted". Which could be anything. A small & not-so-clever example:

...
Jan 02 21:14:54 innodellix systemd[1]: Started Accounts Service.
Jan 02 21:14:54 innodellix systemd[1]: Started Rule-based Manager for Device Events and Files.
Jan 02 21:14:54 innodellix systemd[1]: libvirtd-config.service: Deactivated successfully.
Jan 02 21:14:54 innodellix systemd[1]: Finished Libvirt Virtual Machine Management Daemon - configuration.
Jan 02 21:14:54 innodellix systemd[1]: suid-sgid-wrappers.service: Deactivated successfully.
Jan 02 21:14:54 innodellix systemd[1]: Finished Create SUID/SGID Wrappers.
Jan 02 21:14:54 innodellix systemd[1]: Started Cleanup of Snapper Snapshots.
...

If one would like to highlight the descriptions of all the services that have started, something like this could be used:

[[keywords]]
words = [ '/Started (.*)\./' ]
# a regex could be detected because it starts and ends with a slash, though a separate optional attribute such as `regexes` might be clearer
style = { bg = "green" }

(hell, now that I think about it, it'd be even better with command line arguments like --words-green '/Started (.*)\./',another_word )
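
The proposal of mixing plain words and slash-delimited regexes in a single `words` list could be sketched roughly as follows. This is a Python illustration of the idea only, not tailspin's implementation, and the helper name `compile_word` is hypothetical.

```python
import re


def compile_word(entry):
    """Compile one `words` entry.

    An entry of the form /.../ is treated as a regex; anything else is
    matched literally.
    """
    if len(entry) > 2 and entry.startswith("/") and entry.endswith("/"):
        return re.compile(entry[1:-1])
    return re.compile(re.escape(entry))


patterns = [compile_word(w) for w in [r"/Started (.*)\./", "another_word"]]
line = "Jan 02 21:14:54 innodellix systemd[1]: Started Accounts Service."
for p in patterns:
    print(p.pattern, "->", bool(p.search(line)))
```

One drawback of the slash convention is that a literal keyword which happens to start and end with a slash would be read as a regex, which is the argument for a separate attribute.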

While I accept this looks nasty, it has the potential to be crazy helpful, especially (imho) for regexp-monkeys like me.

In any case, thanks for the tool! :)

hzjc commented 9 months ago

After Windows 11 installation, run exe directly?

bensadeh commented 9 months ago

I created a regexp highlighter on the regexp-highlighter branch.

The format is like so:

[[regexps]]
regular_expression = 'Started (.*)\.'
style = { fg = "red" }

You can add as many [[regexps]] entries as you'd like. You can run and test it out using the example logs and an example config with this one-liner:

cargo run -- example-logs/example1 --config-path config.toml

Let me know what you think.
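
To make the `[[regexps]]` format above more concrete, here is a small Python sketch of how such entries might be applied to a log line. It is not the code on the branch: the `REGEXPS` list stands in for the parsed config, and the colour table `ANSI_FG` and the function `highlight` are assumptions for illustration. It colours the whole match of each expression.

```python
import re

# Stand-in for the parsed [[regexps]] entries (one dict per entry).
REGEXPS = [
    {"regular_expression": r"Started (.*)\.", "style": {"fg": "red"}},
]

# Assumed minimal mapping from colour names to ANSI SGR foreground codes.
ANSI_FG = {"red": "31", "green": "32", "cyan": "36"}
RESET = "\x1b[0m"


def highlight(line, entries=REGEXPS):
    """Wrap every match of every configured regex in ANSI colour codes."""
    for entry in entries:
        pattern = re.compile(entry["regular_expression"])
        start = "\x1b[" + ANSI_FG[entry["style"]["fg"]] + "m"
        # Note: later patterns see the escape codes added by earlier ones;
        # a real highlighter would need to track spans instead.
        line = pattern.sub(lambda m: start + m.group(0) + RESET, line)
    return line


print(highlight("Jan 02 21:14:54 innodellix systemd[1]: Started Accounts Service."))
```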

seqizz commented 9 months ago

Woah, thanks for the quick work (also for the flake, I didn't realize the local setup would be that fast :rocket: )

Works nicely, with just one quirk in the current state: parentheses imply capture groups, so I was expecting only that part of the line to be highlighted (example). So if capture groups are detected, we might want to apply the style only to those.

bensadeh commented 9 months ago

Thanks for clarifying the part about the capturing groups.

I added support for highlighting only what's inside the capturing group in 25280bd50bce0b9fcc698ba4e404d0109372a09f. Let me know if it is working as intended.
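
The behaviour discussed here (style only the capture groups when the expression has any, otherwise the whole match) can be sketched in Python as below. This is an assumption about the behaviour as described in the thread, not the code in that commit; `highlight_groups` is a hypothetical name.

```python
import re

RED, RESET = "\x1b[31m", "\x1b[0m"


def highlight_groups(pattern, line):
    """Colour only the capture groups of each match.

    If the pattern has no capture groups, the whole match is coloured.
    """
    regex = re.compile(pattern)
    out, pos = [], 0
    for m in regex.finditer(line):
        # Group 0 is the whole match; groups 1..n are the capture groups.
        indexes = range(1, regex.groups + 1) if regex.groups else [0]
        for i in indexes:
            start, end = m.span(i)
            # Skip groups that did not participate (span is -1, -1),
            # empty groups, and groups nested inside one already coloured.
            if start < pos or start == end:
                continue
            out.append(line[pos:start])
            out.append(RED + line[start:end] + RESET)
            pos = end
    out.append(line[pos:])
    return "".join(out)


print(highlight_groups(r"Started (.*)\.", "systemd[1]: Started Accounts Service."))
```

With the expression from the example config, only the service description is coloured, while `Started ` and the trailing dot stay unstyled.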

seqizz commented 9 months ago

Yep, now it's complete. Thanks for the quick action :rocket:

bensadeh commented 9 months ago

Great! Thanks for confirming. Will be available in the next release.