rsyslog / rsyslog-doc

documentation for the rsyslog project

documentation is hard to use and badly structured #394

Open grinapo opened 6 years ago

grinapo commented 6 years ago

I sincerely apologise in advance for my seemingly negative attitude.

I have been suffering with the rsyslog documentation for years now. I tried to be calm and positive, hoping that it'll get better as the program gets wider use, but every time I have to go back to rsyslog I realise how painful it is.

My main problem is structure and clarity. The whole documentation makes a complex dance around the various formats the config file may use, with all kinds of syntax scattered everywhere, often mixed. The basic structure of the config is never described, and the order of the elements in the TOC is seemingly random.

Let's change to constructive (without actually rewriting the docs since I only use rsyslog when I am forced to, mostly due to this problem). What do I have in mind?

Like first start with describing the alternative formats, clearly defining whether each is obsolete, or alternative, or preferred. This part should actually take real life into account, like I've never seen a distro using "rainerscript" even though the docs suggest all else is obsolete. Except other parts say it's the opposite, as rainerscript is slow and undesirable. Or else.

Then describe the formats separately. Make all documentation self-contained and whole, so if I am not interested in, say, rainerscript, I do not have to browse through its documentation for hints related to old syslog format.

Doing that you may start with structure: what is the structure of the config format? and I mean full structure. Like there is no description for old-syslog format like:

"The general format is: [...] filter [...] filter [...] [...] action & [...] action & [...] action [...]

where filter is: ....

and within the elements are .....

where action is: ..."

So right now it's almost impossible to see whether it's possible to use more than one filter, and whether they're semicolon separated, or newline separated, or both, or neither; whether they are ANDed or ORed together; whether the filename element is compulsory or not, or whether it may be something other than a filename (tilde, foreign host etc); whether actions may be put on a second line, or lines, and how and why.
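For reference, a minimal sketch of the traditional selector/action layout these questions are about. The file names and the 192.0.2.10 address are placeholders, not taken from the thread, and the comments describe the commonly documented syslog.conf behaviour rather than a full specification.

```
# Several selectors for one action are separated by semicolons. They are
# processed left to right, and each one can override the preceding ones:
# here "mail.none" excludes mail messages from the "*.info" match.
*.info;mail.none;authpriv.none    /var/log/messages

# The action does not have to be a file name: "@" forwards via UDP and
# "@@" forwards via TCP.
*.err                             @192.0.2.10
*.crit                            @@192.0.2.10:514

# "&" on the following line attaches a further action to the previous filter.
:msg, contains, "error"           /var/log/errors.log
&                                 @@192.0.2.10:514
```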

And do not try to cram it into one page and one paragraph. Show the structure, and link to separate pages describing the elements.

And this is only just one example how to describe the whole syntax clearly, and after that there could be the list of various examples.

The current documentation feels like a tutorial, which shows usage examples (and badly at that) without specification and syntax. It intermixes the formats, mixes all kinds of usage, and does not consider real life.

I can give you ideas, structure. I won't write documentation, sorry. I can check it and can offer insights, if I must. And you may close this issue if you feel you can't help it and there are no other volunteers to get it done.

rgerhards commented 6 years ago

@grinapo Thanks for the great post. I admit I am frustrated with the doc myself, but have not managed to get it into better shape. Unfortunately, we have limited volunteers in any case.

I appreciate your try to move things towards the better side. I can't promise I will be able to actually make it happen. I don't know if someone else will. But, I would at least try to get something started. Real-User feedback, especially in regard to structure is very useful and has been rarely seen (besides "this is bad"). So if you can live with a slow process and provide feedback along the way, I'd try to do a bit every now and then. And maybe some others jump in as well (hope dies last).

What do you think?

rgerhards commented 6 years ago

@deoren would you be up to work on such an endeavour together with me (and hopefully others)? It for sure would be very useful...

deoren commented 6 years ago

@rgerhards @deoren would you be up to work on such an endeavour together with me (and hopefully others)? It for sure would be very useful...

I have an interest in doing so, but very limited on time right now. I hope to focus on some small doc updates here/there during upcoming holidays, but probably nothing big.

As others have said here and on other related issues (e.g., #68), a reworked Table of Contents is a good place to start. I've not yet reviewed prior discussions in detail, but it will help to know on what points you do not wish to bend.

For example, on #270 and #88 there had been expressed interest in removing legacy options from the documentation. I am not a fan of the legacy format for reasons already stated often in the existing documentation (error prone, harder to visually follow, etc) and so have an interest in working to purge that content except for cases where the new config format isn't supported (side note: it would be a huge benefit to have the docs clearly state exactly which modules, directives, etc).

That said, if you plan to continue supporting both legacy format and Rainerscript and wish for the docs to provide solid coverage of both where both are fully supported, that would be a good goal to make clear.

deoren commented 6 years ago

@grinapo Like first start with describing the alternative formats, clearly defining whether each is obsolete, or alternative, or preferred. This part should actually take real life into account, like I've never seen a distro using "rainerscript" even though the docs suggest all else is obsolete. Except other parts say it's the opposite, as rainerscript is slow and undesirable. Or else.

I think you have some good points. As I went through the documentation (I read v8.28.0 docs start to finish) I noticed that it needs several reviews by an editor to catch some historical remarks that refer to various limitations (e.g., property case sensitivity) that may no longer apply (or are now optional via config parameters). That said, I write a lot of documentation for work and have a good appreciation for how much effort is required.

I am not particularly skilled at it, but even so I often find that even simple things are overlooked when you are close to the source material and know it so well. You simply don't notice the omission or lack of extra detail when it's already in your mind. It's a very easy mistake to make. Also, as other devs in other projects have noted, documentation isn't as interesting as implementing new features or fixing bugs. It is important, but doesn't have as much of an immediate impact on the project's health as good progress on fixing issues.

Re distros not supporting Rainerscript, I understand what you're saying. It is frustrating when you are just learning rsyslog and all you see is the old syntax. When you start diving into the docs and find that what you've been working with is all outdated, it's a little discouraging. That said, look at Ubuntu's LTS releases for a reason why this may be the case.

Ubuntu 16.04 is the latest LTS release and came out in April 2016. It was the first LTS release to include v8.x rsyslog and AFAIK could fully support Rainerscript. At the time Ubuntu 12.04 still had 5.x rsyslog and Ubuntu 14.04 had 7.x rsyslog. Were I a lead Ubuntu dev I would probably opt to use a syntax in the /etc/rsyslog.conf and any /etc/rsyslog.d/*.conf files that would work across all rsyslog versions included in the then supported LTS releases that I had to support. I would then rely on links to upstream docs to help users learn of alternate options for configuring rsyslog. It's unfortunate, but it makes sense from their perspective.

Other distros may have been in the same boat with their latest stable releases (maybe RHEL and CentOS with distro versions v5, v6 and v7?). In short, distros likely play it safe in this regard.

@grinapo Then describe the formats separately. Make all documentation self-contained and whole, so if I am not interested in, say, rainerscript, I do not have to browse through its documentation for hints related to old syslog format.

I think this is a solid point.

@rgerhards I would personally advocate serious thought to splitting off all documentation related to the legacy format and maintaining it separately from the primary documentation. Sure, that is more work, but if the project goal is to reduce support for the legacy format over time it would mainly be grammar and clarification improvements made to both sets of docs, not big feature changes. For the most part the legacy docs would only be touched when those fixes are "backported" from the current docs.

@rgerhards If it's not already being done (haven't checked), I'd include an optional rsyslog.conf file with your upstream packages that uses all new configuration syntax as a drop-in replacement for what you're using by default to replace the distro-provided conf file. I understand that the included rsyslog.conf file has to be as close to a 1:1 drop-in replacement for the existing file in order to not disrupt user expectations, but having the alt conf file and a reference within the /etc/rsyslog.conf file for the user to find would be useful.

Heh, of course that is more work, but what is a healthy Open Source project without many requests from users for enhancements. ;)

rgerhards commented 6 years ago

@deoren That said, if you plan to continue supporting both legacy format and Rainerscript and wish for the docs to provide solid coverage of both where both are fully supported, that would be a good goal to make clear.

Legacy format is fine for very simple things, like writing to a file via traditional filters. Anything more complicated (like defining queues, rulesets, inputs) is horribly done in legacy format and should not be promoted. We should probably try to work better with distros to get that stuff removed, if that's actually an issue.

rgerhards commented 6 years ago

@grinapo Then describe the formats separately. Make all documentation self-contained and whole, so if I am not interested in, say, rainerscript, I do not have to browse through its documentation for hints related to old syslog format. @deoren I think this is a solid point.

Yes, but... complex things in the old format are really a pain in the a.... It is extremely easy to get wrong and definitely not something we should see in a modern rsyslog.conf. Doing good documentation on that old cruft is probably even more effort than documenting the good stuff and it would even encourage people to fight with all the evil old stuff. We have covered the old stuff as an aid for folks who look things up. If that really is problematic, I would opt to totally remove the (complex, evil) old stuff and just refer to the old v5 doc (I think we had RainerScript since v6).

There is actually no need to "maintain" doc on the old cruft: we do not accept additions and modifications of legacy statements. It is supported, and continues to be, so that you can safely upgrade any rsyslog version to a new one. But functionality-wise, legacy config format is frozen. New features are only enabled via RainerScript. So v5 is the definitive reference on the "masochist stuff";-).

@rgerhards If it's not already being done (haven't checked), I'd include an optional rsyslog.conf file with your upstream packages that uses all new configuration syntax as a drop-in replacement for what you're using by default to replace the distro-provided conf file. I understand that the included rsyslog.conf file has to be as close to a 1:1 drop-in replacement for the existing file in order to not disrupt user expectations, but having the alt conf file and a reference within the /etc/rsyslog.conf file for the user to find would be useful.

At least I had asked the packaging guys to change the rsyslog.conf we distribute to new style format. HOWEVER, it is not necessary (nor useful IMHO) to replace things like mail.* /var/log/mail.log by

if prifilt("mail.*") then {
   action(type="omfile" file="/var/log/mail.log")
}

but, on the other hand, it totally makes sense to use

if prifilt("mail.*") then {
   action(type="omfile" file="/var/log/mail.log" template="mytemplate"
              fileCreateMode="006"  fileOwner="me" ...)
}

instead of whatever horrible construct was used in legacy conf (I've actually forgotten how it was. really. honestly).

So, well, it might indeed make sense to use new style for everything to make it easier to extend. Is that what you mean?

rgerhards commented 6 years ago

You might find this blog posting an interesting read on why we created RainerScript: http://blog.gerhards.net/2011/07/rsyslog-633-config-format-improvements.html

rgerhards commented 6 years ago

We also have an interesting PR right here https://github.com/rsyslog/rsyslog-doc/pull/188.

Unfortunately, I have been totally ignorant about it (a loooot of shame on me), else this might have turned into something pretty fruitful. I just re-discovered it when this thread was created. :-(

deoren commented 6 years ago

@rgerhards Sorry for the quick/incomplete response, but in short, I was recommending splitting off all coverage of old legacy content to a separate section. In essence, have two copies of the documentation:

The only updates to the Legacy docs would be grammar, typo and tweaks to content to clarify intention, things like that. All new content would go into the Current docs collection. Of course if there haven't been big changes to the Legacy support since v5, then the Current docs could simply have multiple references to, "See the v5 docs for coverage of the Legacy configuration syntax".

This would allow the v8+ docs to be cleaned up to remove all references to the old content. Perhaps I'm just repeating what you've already said?

grinapo commented 6 years ago

@rgerhards Yes, but... complex things in the old format are really a pain in the a.... It is extremely easy to get wrong and definitely not something we should see in a modern rsyslog.conf.

The problem is twofold.

1) The "modern" documentation should not contain obsolete things, neither describing them nor using them in examples, and definitely not mixing them into the text body. It really confuses the reader. It should contain a link to the old format documentation, or many links, but no inline text body or examples.

2) The "old format" doc should be kept while it actually works, and it may refer to the new format. What I would really advise is that all references to rainerscript (be that description or example) should be extremely visibly marked as such. The user should be able to see what can be done in the old script, and what is not possible and requires "mixing" them; also it would be useful in such cases to specifically link the same example from the "modern" documentation/examples.

As to the other comments I believe I completely agree. Only thing I would like to emphasize is the clear and separate syntax definition of the structure of the config file. (Especially for the old format, while it lasts, but the new one requires it too.)

georgehank commented 2 years ago

The filters page in the current (v8 2000-something, in 2021) documentation has this: "However, we try to implement the scripting facility as soon as possible (also in respect to stage work needed)." I assume that this is outdated? I can only assume, based on that v8 supposedly(?) has full "RainerScript" :D, that this also includes full expression filters?

qbe commented 1 year ago

IMHO there is another, worse problem with the documentation: a lack of comprehensive lists for various items (for example expressions)

qbe commented 1 year ago

Also, rsyslog.com/docs displays full screen ads.

computerquip-work commented 12 months ago

The documentation is so poor that it's almost unusable. Using RainerScript is hands down the most painful thing I've ever used in software.

davidelang commented 12 months ago

Can you help us understand the problems that you ran into?

Rsyslog documentation is mostly maintained by volunteers, and the pages mostly written as one-off add-ons to go with new code that's written (and is mostly written by people who are very close to the code, which is always a problem). The docs are in a public git repo so that people can help us improve them. Saying "patches welcome" is not being passive aggressive, it's a cry for help.

You specifically are having problems with RainerScript, what problems are you having? RainerScript is actually very simple (and limited)

A lot of the time people having problems with it are just trying to do things that it's not able to do. It's a replacement for the old syslog syntax and some reverse polish notation conditional syntax that had evolved over time. The if..then..else nature is far simpler than what it replaced, but it's not, and isn't designed to be, a full featured programming language.

The other big problem people commonly have is mixing the pre-RainerScript syntax (which heavily relied on side-effects from prior statements) with RainerScript syntax (which was explicitly designed to not depend on such side-effects). To keep from breaking backwards compatibility, the old syntax continues to be supported.

Besides RainerScript clarification, what else do you suggest that we can do to improve the documentation?

David Lang

On Wed, 1 Nov 2023, computerquip-work wrote:

The documentation is so poor that it's almost unusable. Using RainerScript is hands down the most painful thing I've ever used in software.

davidelang commented 12 months ago

We have received complaints about rsyslog documentation repeatedly. We have a lot of detail, but it's all written for someone already fairly familiar with things.

Here is a 3am first pass from me at writing an overview of how rsyslog works, with the idea that this could be made pretty with diagrams, click through links to more specific pages with detail, etc.

I'm replying to the github issue to see if the user who complained about the documentation and RainerScript would find this more useful, but also to rsyslog-users to get feedback from others on this.

Some of the sections here should possibly be broken into sub-pages (some sub-pages already exist that cover some of these and can/should be simplified), or it may make sense to have a simple version on an overview page with the ability to click down for the gory details.

David Lang

Rsyslog architecture is very straightforward, but in its simplicity it hides a lot of flexibility.

Rsyslog has one or more inputs that each receive one or more messages and pass the batch of messages to a ruleset

Each input runs the incoming log through a stack of possible parser modules until it hits one that reports success in parsing the log (pointer to parser module documentation and the default stack)

Multiple inputs can feed to the same ruleset (by default, all inputs feed to the Default ruleset which uses the 'main' queue) [1]

Worker threads pull batches of logs from a queue, then process the logs in the batch using the statements in a ruleset

Conceptually, it really is that trivial. As always, looking at details makes it seem more complicated.
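A minimal sketch of that pipeline in RainerScript, assuming two network inputs that share one ruleset. The ruleset name "remote", the port numbers, the queue size and the file path are illustrative choices, not values from this overview.

```
# Module loading happens once, at startup.
module(load="imudp")
module(load="imtcp")

# Two inputs feed the same ruleset instead of the default one.
input(type="imudp" port="514" ruleset="remote")
input(type="imtcp" port="514" ruleset="remote")

# The ruleset gets its own queue. Worker threads pull batches of messages
# from that queue and run the statements below for each message.
ruleset(name="remote" queue.type="LinkedList" queue.size="10000") {
    action(type="omfile" file="/var/log/remote.log")
}
```

Without the ruleset="remote" parameter, both inputs would feed the default ruleset and its main queue, as described above.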

Rsyslog config file(s)

Rsyslog reads the config file and all included files and combines them before evaluating anything (see the -o option for how files are combined). Which file a statement is in has no impact, other than on the ordering of statements. (insert link to Rainer's recent post on mis-use of config includes??)

At startup time, Rsyslog evaluates the combined config and implements all module loading, input definition, template definitions and other global settings.

All other statements get put into the default ruleset unless a ruleset is specified. None of these statements are evaluated (beyond syntax checking) at startup.

The Rsyslog team believes very strongly in maintaining backwards compatibility (a config that works should never break or change behavior when rsyslog is updated to a new version). As such, there are multiple ways of doing the same thing, and some ways are no longer recommended. When you see that something is deprecated, that means it is recommended not to use it in a new configuration for confusion/feature reasons, not that it is scheduled to go away/break in a new version.

The config statements that existed prior to v6 of rsyslog were an evolution of the syslog format from the 90's, doing complex things by setting a bunch of values that then got used by later statements. By v5 of rsyslog, this was resulting in such complex interactions that even core developers were having trouble understanding what complex configs did. V6 introduced RainerScript, which deliberately requires you to specify all options rather than 'inheriting' them from prior statements. This can be significantly more verbose as it requires you to specify all options each time, but makes it much clearer exactly what is happening. There are times when the old syntax is shorter and more obvious to use than the new syntax, and in those cases, it's recommended to use the old syntax. But if the old syntax requires multiple lines to do something, you are probably better off using the new syntax.
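To illustrate the difference, here is a hedged sketch of the same file output written both ways. The two halves are alternatives, not meant to be used together; the template names, the "syslog" owner and the 0640 mode are made-up values for illustration.

```
# Legacy style: directives set values that later actions silently inherit.
# Every file action after these lines is affected until they are changed.
$template mytemplate,"%timestamp% %hostname% %msg%\n"
$FileOwner syslog
$FileCreateMode 0640
mail.*    /var/log/mail.log;mytemplate

# RainerScript style: everything the action needs is stated on the action,
# so nothing depends on what came earlier in the file.
template(name="mytemplate2" type="string"
         string="%timestamp% %hostname% %msg%\n")
mail.* action(type="omfile" file="/var/log/mail.log"
              template="mytemplate2" fileOwner="syslog" fileCreateMode="0640")
```

The one-line case (mail.* /var/log/mail.log with no extra settings) is where the old syntax stays shorter; once several $-directives are needed, the action() form keeps the settings next to the thing they apply to.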

Rulesets are the heart of log processing, defining what happens with each log message. The statements in a ruleset are evaluated for every log message as it is processed.

Rulesets and Actions can have a queue defined for them (insert link to queue turn lane post, possibly with updates). The 'default' ruleset uses the 'main' queue.

The contents of a ruleset are a series of statements, which can be:

  1. call an action to use an output module
     1a. legacy formats:
         /var/log/messages (write to a file)
         @1.2.3.4 (send to a remote system via UDP)
         @@1.2.3.4 (send to a remote system via TCP)
     1b. action() format
  2. set/clear variables (link to functions)
  3. call a message modification module (can modify the log message being processed and set variables), commonly used to parse messages
  4. call another ruleset and then return
  5. statement block
     { statement statement } usually used after a filter to have the filter apply to multiple statements
     & apply the last filter to this statement [2]
  6. stop processing this log message
     6a. ~ [2]
     6b. stop (ignore all following statements in the ruleset). Rsyslog will warn you if you have statements after an unconditional stop
  7. apply a filter
     7a. legacy syslog facility.severity filters, e.g. mail.info /var/log/mail
     7b. rsyslog property-based filters [2], e.g. :msg, contains, "foo"
     7c. expression-based filters (if..then..else with continue) (link to functions and conditionals), e.g. if $msg contains("foo") then
  8. atomic stats update (see impstats module)
  9. foreach: execute a block of statements on each value in a variable array
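A short sketch showing several of these statement types together. The ruleset name "audit", the variable name and the file paths are hypothetical; the top-level statements land in the default ruleset.

```
# A separate ruleset that can be called from elsewhere (item 4).
ruleset(name="audit") {
    action(type="omfile" file="/var/log/audit.log")
}

# 7c: expression-based filter, followed by a statement block (item 5).
if $msg contains "foo" then {
    # 2: set a local variable
    set $.matched = "yes";
    # 1b: action() format
    action(type="omfile" file="/var/log/foo.log")
    # 4: call another ruleset, then return here
    call audit
    # 6b: stop processing this message
    stop
}

# 7a with 1a: facility.severity filter and a legacy file action
mail.info /var/log/mail

# 7b: property-based filter
:msg, contains, "bar" /var/log/bar.log
```

Messages containing "foo" never reach the last two statements because of the stop; every other message is tested against each filter in order.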

Variable types:

  • built-in/legacy properties start with $ or $$ (link to property page)
  • user-modifiable variables exist as a tree, internally represented as a JSON structure. There are three trees that can be used:
      "normal variables" start with $!
      "local variables" start with $. (exist so that you can include all $! variables in a template without including everything)
      "global variables" start with $/ (persist past the log message where they were set, performance pigs)
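A hedged sketch of the three user-variable trees in use; the global tree is written $/ in the rsyslog documentation. All variable names, the template name and the file path are invented for illustration.

```
# Local variable ($.): scratch value, kept out of the $! tree.
set $.host = $hostname;

# "Normal" variable ($!): part of the message's JSON tree.
set $!usr!source = "edge";

# Global variable ($/): persists beyond this message.
set $/last_source = "edge";

template(name="vars" type="list") {
    property(name="$.host")
    constant(value=" ")
    property(name="$!usr!source")
    constant(value=" ")
    # "$!" on its own emits the whole $! tree as JSON; $.host is not in it.
    property(name="$!")
    constant(value="\n")
}

action(type="omfile" file="/var/log/vars.log" template="vars")
```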

Templates are used by output modules. They are used to create larger strings that use variable values for use with the module. These allow you to change the format of the output, what file or database table the log gets written to and other similar things. The details of what the result of the template means varies from one module to another. There is a common misconception that a template can be used to match and parse a log message. Templates are output only.
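As a sketch of what this looks like in practice, the first two templates below are intended to produce the same line, once in list form and once in string form, and the third builds a file name rather than a log line. Template names and paths are assumptions made for the example.

```
# List form: each piece is spelled out explicitly.
template(name="plain_list" type="list") {
    property(name="timestamp" dateFormat="rfc3339")
    constant(value=" ")
    property(name="hostname")
    constant(value=" ")
    property(name="msg")
    constant(value="\n")
}

# String form: the same output in the compact %property% notation.
template(name="plain_string" type="string"
         string="%timestamp:::date-rfc3339% %hostname% %msg%\n")

# A template used for the file name instead of the line format.
template(name="per_host_file" type="string"
         string="/var/log/hosts/%hostname%.log")

action(type="omfile" dynaFile="per_host_file" template="plain_list")
```

How a template's result is interpreted depends on where the module uses it: here omfile takes one template as the output path and the other as the line format.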

[1] it may make sense to have a 'are you sure' message at startup about inputs that feed to rulesets that don't have a queue defined for the ruleset. I don't think a new warning would be a breaking change

[2] supported for backwards compatibility, use is discouraged

(note, there should possibly be two versions of this, one showing the straightforward, single-message-at-a-time process, and a second one that shows the advanced, batch-supporting version that includes showing where locking happens; the atomic stats and foreach would be in the advanced version)


computerquip-work commented 12 months ago

This is a bit unorganized of a take so I'm going to apologize ahead of time. These are the things I could think of off the top of my head.

Documentation is unclear and doesn't take itself seriously.

What I mean by this is that it states things that you can't take at face value. For example, in your overview, you state that [2], or basic syntax, is discouraged but the documentation says this is the format best used to express basic things. People take that comment seriously and I've seen a lot of mixing and matching of both formats, where it then ends up as two things in the same configuration file that get expressed in two different ways. As a result, you can't just know RainerScript or basic syntax, you have to understand both if you want to read a configuration file. Even the sample configuration often used as a default doesn't use RainerScript, it uses the basic and/or legacy syntax. https://github.com/rsyslog/rsyslog/blob/master/sample.conf

You give a solid overview that matches how I view the basic vs RainerScript situation... but also, while RainerScript is more verbose, it's incredibly confusing to mix and match several syntaxes together. It is not clear to me at all what's "recommended" anymore and rsyslog (both as a community and a product) itself seems unclear on the topic.

Variables and their use are a mess.

I'm still not sure how to express variables in RainerScript. For examples that are used in the documentation:

Where am I supposed to look in the documentation to interpret these? There is some explanation here. But notice that it's not comprehensive. It doesn't mention all of the formats above at all. I'm basically on my own for anything not documented for the examples above. I've ended up using $. for most everything since I don't have any idea why I'd use $! and I still to this day have no clue what $$ means (the best I can figure is that the actual variable name is $now-unixtimestamp and it's just stuck like that).

RainerScript itself has multiple ways to express something but no clear guidance on which to choose.

Templates have several different ways to express themselves and it's not clear why you'd use one over the other. For the most part, I've just used the more expressive version with explicit constant, property, etc. in a list. There are a couple of instances where I couldn't figure out how to express that in a list (or it wasn't possible as far as I could tell from the documentation) so I did use string. I don't know why I'd use subtree.

A nit pick but text is displayed in a not-friendly manner.

Some parts of the online documentation requires you scroll over a ridiculous amount to actually read it: https://i.imgur.com/Ujl289L.png

The index is too empty.

Not sure what's up with the index but there's basically nothing in there. No reference to global(), input(), or various other keywords and terms that would be very useful. For example, if I want to see how the contains expression work, I'd imagine I could go to the index to find a page related to it.

There is no search function.

The search function for the site doesn't appear to pertain to the documentation unless I'm misunderstanding. If I want to search for the expression contains or global, there's no way to do so. Even if I search for something very specific such as RuleSetCreateMainQueue, I get no useful results.

The basic vs RainerScript conversion/mapping is wildly incomplete.

Why isn't stuff like $Ruleset listed on the page for conversion? There's a lot in RainerScript and basic that's 1:1, there should be documentation on how the two map together if they're to be used together.

A (maybe bad) example

For a practical example, let's say I see $Ruleset RSYSLOG_DefaultRuleset and I want to figure out what exactly that does. Where do I even begin? This looks like basic but if I look over in Legacy Configuration Directives, there's no mention of it. There's no mention of it on the conversion page. I see documentation for rulesets over in basic structure but still no mention of $Ruleset although it does mention RSYSLOG_DefaultRuleset. Search doesn't work so I can't do that. It's not listed in the index. At the bottom of the Table of Contents, there's a page named Multiple Rulesets in rsyslog where it lists what it does and what that particular ruleset means but I have to know to look there.

I think the example is on the ridiculous side because I think most people should be able to assume that $Ruleset just changes the current ruleset. But there are parts in the example that should have worked, such as search or index, that failed. $Ruleset is basic syntax but there's nowhere it's listed as such. If you apply this to other things you might find in an older configuration like $RuleSetCreateMainQueue, each time you have to search through the documentation is a different path in the maze to finally get to where you need to be.
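A hedged sketch of the mapping in question, assuming $RuleSet behaves as inferred above (it switches the ruleset that subsequent rules are added to). The ruleset names and the file path are hypothetical, and the two halves are alternatives rather than a single config.

```
# Legacy: $RuleSet switches the "current" ruleset. The rules that follow
# belong to it until the next $RuleSet line.
$RuleSet remote
*.* /var/log/remote.log
# Switch back so that later rules land in the default ruleset again.
$RuleSet RSYSLOG_DefaultRuleset

# RainerScript: the braces delimit the ruleset, so there is no "current
# ruleset" state to switch back from.
ruleset(name="remote2") {
    action(type="omfile" file="/var/log/remote.log")
}
```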

computerquip-work commented 12 months ago

I'll look into patches. I will admit that it's hard to quantify what's wrong with it. There is a lot of the necessary documentation there but its organization is so messy that it becomes difficult to use. Tools that would help workaround that poor organization (like an index or search function) don't work which just compounds the problem. I think an argument could be made about how the documentation is indecisive about basic vs RainerScript but ultimately, I don't think it should matter if the documentation were organized in a way to where it's easy to reference and understand either.

davidelang commented 12 months ago

On Thu, 2 Nov 2023, computerquip-work wrote:

This is a bit unorganized of a take so I'm going to apologize ahead of time. These are the things I could think of off the top of my head.

  1. Documentation is unclear and doesn't take itself seriously.

What I mean by this is that it states things that you can't take at face value. For example, in your overview, you state that [2], or legacy syntax, is discouraged but the documentation says this is the format best used to express basic things. People take that comment seriously and I've seen a lot of mixing and matching of both formats, where it then ends up as two things in the same configuration file that get expressed in two different ways. As a result, you can't just know RainerScript or legacy syntax, you have to understand both if you want to read a configuration file. Even the sample configuration often used as a default doesn't use RainerScript, it uses the legacy syntax. https://github.com/rsyslog/rsyslog/blob/master/sample.conf

Yes, this is true. RainerScript is a recent addition because attempts to graft more functionality into the old syslog syntax got so ugly that even the rsyslog developers were having trouble reading configs and understanding what they do.

Initially there was talk about phasing out the old syntax, but to maintain backwards compatibility (avoid breaking existing configs) we decided to maintain support for both.

You give a solid overview that matches how I view the legacy vs RainerScript situation... but also, while RainerScript is more verbose, it's incredibly confusing to mix and match several syntaxes together. It is not clear to me at all what's "recommended" anymore and rsyslog (both as a community and a product) itself seems unclear on the topic.

We have to support both to avoid breaking existing configs. The recommendation is to use whichever is the clearest to the team maintaining the config, but if you need to use multiple lines to configure something in the legacy format, you are probably better off using the new format.

  1. Variables and their use are a mess.

I'm still not sure how to express variables in RainerScript. For examples that are used in the documentation:

  • property(name="$!usr!msgnum")
  • @.***" value="1" format="jsonf")` (Actually isn't a variable at all)
  • set $!usr!tpl2!dataflow = field($msg, 58, 2);
  • property(name="$!")
  • set $.tnow = $$now-unixtimestamp

Where am I supposed to look in the documentation to interpret these? There is some explanation here, but notice that it's not comprehensive: it doesn't mention all of the forms above. I'm basically on my own for anything not documented. I've ended up using $. for almost everything since I don't have any idea why I'd use $!, and to this day I have no clue what $$ means (the best I can figure is that the actual variable name is $now-unixtimestamp and it's just stuck like that). There's no mention of scoping (or lack thereof), and no real mention of how to set your own variables, only that you can do it.

  1. Templates are split into different formats. Similar to 1, templates have several different ways to express themselves and it's not clear why you'd use one over the other. For the most part, I've just used the more expressive version with explicit constant, property, etc. in a list. There are a couple of instances where I couldn't figure out how to express that in a list so I did use string.

These are both the legacy of how things were added to rsyslog (along with the implementation details), and can't be cleaned up without breaking backwards compatibility. Yes, in retrospect it's bad and ugly and should have been done differently back in the really early days, but we don't see a way to get out of it. I can give you an explanation of what is and why it got this way. I'd appreciate any suggestions on how we can better document this (as I said before, the people who wrote the documentation are too close to the code).

initially there were 'message properties' such as timestamp and hostname. then system properties were added such as $myhostname https://www.rsyslog.com/doc/v7-stable/configuration/properties.html

These were referenced in templates as $template foo, "this uses a variable %timestamp% or %$myhostname%". When RainerScript was added, they were referenced as $timestamp and $$myhostname in an if statement.
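As a rough illustration of the two reference styles just described (the comparison and the file path are made up for the example):

```
# legacy template directive: properties are wrapped in %...%,
# and system properties keep their leading $ inside the percent signs
$template foo,"this uses a variable %timestamp% or %$myhostname%\n"

# RainerScript expression: message properties are written $name,
# system properties are written $$name
if $hostname == $$myhostname then {
    action(type="omfile" file="/var/log/local.log" template="foo")
}
```

Read this way, $$now-unixtimestamp would be the system property $now-unixtimestamp referenced from RainerScript, which matches the guess made in the question.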

RFC 5424 was written to standardize syslog formats better than the prior RFC 3164, and it included the ability to add structured data to log messages. Pretty much nobody used it. A few years later, the various logging projects got together to try to define a standard for structuring logs in messages. The only part of it that survived was the idea to encode messages as JSON in the body of the message, and then have the logging systems parse the messages with ! as a reserved character. So {"a": "foo", "b": {"c": "bar", "d": "baz"}} would let you use:

  • $! (returning {"a": "foo", "b": {"c": "bar", "d": "baz"}})
  • $!a (returning "foo")
  • $!b (returning {"c": "bar", "d": "baz"})
  • $!b!c (returning "bar")

This is when user-definable variables were added to rsyslog (initially just as the result of a message modification module parsing messages, but then the set/unset statements were added, allowing manipulation of variables in the config).
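A small sketch of how that tree can be used from the config. It assumes the message body carries the JSON from the example behind the "@cee:" cookie that mmjsonparse looks for by default; the field name e is hypothetical.

```
module(load="mmjsonparse")

# parse the JSON payload of the message into the $! tree
action(type="mmjsonparse")

# individual fields are addressed with ! as the path separator
if $!a == "foo" then {
    # hypothetical new field added under $!b
    set $!b!e = "added by config";
    # remove a field from the tree again
    unset $!b!d;
}
```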

I am responsible for us adding the $. namespace, so that we have a place to put variables that we don't want included when we refer to $!: variables that you use for conditions, things you will use in file path templates, etc. Other than the fact that message modification modules that parse messages populate $! by default, there is no technical difference in how $! and $. variables can be used; they are simply two different namespaces. (Sometimes $. is referred to as 'local' variables, reflecting the history of using it for internal processing, while $! is historically used for things that will end up in an outbound message.)

If you log a message using the RSYSLOG_DebugFormat you will see these variable namespaces down at the bottom of the message block.

$/ was added at the same time as $. so that there is a way to set a variable that persists beyond the processing of a single message. These aren't used much, and the cost of the locking needed to make them reasonably reliable makes them something to avoid if you can.
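The three namespaces side by side, as a minimal sketch. All variable, template, and file names here are hypothetical; the $/ prefix for global variables is the one given in the rsyslog documentation.

```
# part of the $! tree, so it travels with the message
set $!usr!note = "travels with the message";
# local variable: exists only while this message is processed
set $.route = "audit";
# global variable: persists across messages, at the cost of locking
set $/last_host = $hostname;

# %$!% renders the whole $! tree; $.route and $/last_host are not part of it
template(name="all_json" type="string" string="%$!%\n")

# a local variable used to build a file name
template(name="per_route_file" type="string" string="/var/log/%$.route%.log")

action(type="omfile" dynaFile="per_route_file" template="all_json")
```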

The simple template definition doesn't work well when complex escaping is needed, when things need to be formatted into JSON structures, etc., and so new ways of defining a template were added. I'm not sure the new string format should have been added (it's just more syntactic sugar around the old way of defining templates), but that was in the days when a break with the existing config format was being considered.

personally, I almost always use the legacy format for template definitions.
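For comparison, a sketch of one and the same output line written in the three template styles mentioned above. The template names are hypothetical; the property names are standard message properties.

```
# legacy directive
$template plain_legacy,"%timestamp% %hostname% %msg%\n"

# "string" type: the same %property% text inside a template() object
template(name="plain_string" type="string" string="%timestamp% %hostname% %msg%\n")

# "list" type: every piece is an explicit constant() or property()
template(name="plain_list" type="list") {
    property(name="timestamp")
    constant(value=" ")
    property(name="hostname")
    constant(value=" ")
    property(name="msg")
    constant(value="\n")
}
```

The list form is the most verbose, but it is the one where per-property options and literal text never have to share one quoted string.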

Not doing a break with the old config ended up being a significant advantage, it is what allowed the distros to switch from sysklogd (which wasn't being maintained) to rsyslog with minimal disruption. If we had made that change require the new syntax, I think odds are good that syslog-ng would have been selected and rsyslog may have faded away (syslog-ng has now gone the freemium route where you have to pay to get the full feature set)

the documentation for all of this was mostly written one page at a time as things changed, grafting the pages into the existing documentation

Now that I have given you the 'what is' and the history behind it, do you have suggestions for how we can update the documentation to better show and explain this? The docs tend to be a very dry reference material structure, but it may be that we need to give this history somewhere in there to explain the 'why' around this.

And if you can suggest changes that would make things more consistent, please do (but keep in mind that for backwards compatibility, we aren't going to be able to remove support for the existing stuff).

  1. Text is displayed in a not-friendly manner. Some parts of the online documentation require you to scroll over a ridiculous amount to actually read them: https://i.imgur.com/Ujl289L.png

Do you mean horizontal scrolling? We thought we had fixed this.

  1. The index is too empty.

Not sure what's up with the index but there's basically nothing in there. No reference to global(), input(), or various other keywords and terms that would be very useful. For example, if I want to see how the contains expression works, I'd imagine I could go to the index to find a page related to it.

Good point, thanks. I have been tripped up myself looking for global() a time or two.

  1. There is no search function. The search function for the site doesn't appear to pertain to the documentation unless I'm misunderstanding. If I want to search for the expression contains or global, there's no way to do so. Even if I search for something very specific such as RuleSetCreateMainQueue, I get no useful results.

this is actually designed to be packaged and shipped with your distro. But I agree that it would be good to add a specific search the docs capability (I mostly use google and look for hits on rsyslog.com but I know enough of what I'm looking for to find it)

I think it would also be fantastic if it was possible to get sponsorship for the doc site and eliminate the advertising there (I don't know how much adiscon gets from those ads, so I don't know how much sponsorship money would be needed to eliminate them)

For a practical example, let's say I see $Ruleset RSYSLOG_DefaultRuleset and I want to figure out what exactly that does. Where do I even begin? This looks like legacy but if I look over in Legacy Configuration Directives, there's no mention of it. There's no mention of it on the conversion page. I see documentation for rulesets over in basic structure but still no mention of $Ruleset although it does mention RSYSLOG_DefaultRuleset. Search doesn't work so I can't do that. It's not listed in the index. At the bottom of the Table of Contents, there's a page named Multiple Rulesets in rsyslog where it lists what it does and what that particular ruleset means but I have to know to look there.

I think the example is on the ridiculous side because I think most people should be able to assume that $Ruleset just changes the current ruleset. But there are parts in the example that should have worked, such as search or index, that failed. $Ruleset is legacy syntax but there's nowhere it's listed as such. If you apply this to other things you might find in an older configuration like $RuleSetCreateMainQueue, each time you have to search through the documentation is a different path in the maze to finally get to where you need to be.
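For readers who hit the same wall, a hedged sketch of what the directive does, modeled on the example on the "Multiple Rulesets in rsyslog" page. The ruleset name and the file path are placeholders.

```
# obsolete legacy format: every rule after this directive
# is added to the ruleset named "remote"
$RuleSet remote
*.*    /var/log/remotefile

# switch back, so that later rules go into the default ruleset again
$RuleSet RSYSLOG_DefaultRuleset

# advanced format: the ruleset is an explicit block,
# so there is no "current ruleset" to keep track of
ruleset(name="remote") {
    action(type="omfile" file="/var/log/remotefile")
}
```

The two halves are alternatives, not meant to be combined in one file, since both define a ruleset with the same name.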

That's a good example, and it perfectly shows the problem we have. Rulesets weren't initially in rsyslog. When they were added, the concepts page was written to explain them, but the rest of the documentation wasn't significantly changed (other than to add the 'call' capability and the ability to tie a ruleset to an input). Years later, when the page on legacy statements was added, that one was missed.

Rainer, is there a relatively easy way to search the code for legacy type statements to make sure they are all documented on the legacy config page?

David Lang

davidelang commented 12 months ago

On Thu, 2 Nov 2023, computerquip-work wrote:

I'll look into patches. I will admit that it's hard to quantify what's wrong with it. A lot of the necessary documentation is there, but its organization is so messy that it becomes difficult to use. Tools that would help work around that poor organization (like an index or search function) don't work, which just compounds the problem. I think an argument could be made about how the documentation is indecisive about basic vs RainerScript, but ultimately I don't think it should matter if the documentation were organized in a way that makes it easy to reference and understand either.

no disagreement, as I have said, the people who wrote the docs are far too close to the code, and it was added piecemeal over a couple of decades now.

Good callout on the need for a doc-specific search functionality (I think when readthedocs imports the rsyslog docs, they make it searchable, but they tend not to keep up to date).

David Lang

rgerhards commented 12 months ago

Thx for the good discussion.

I am not happy with the doc in either case. Actually, accessing it has gotten worse when we tried to improve it the last time. From my perspective it's side-work that I am not so much motivated to work on, as a) few people read it at all, b) it costs massive amounts of time if done correctly, c) it's very hard for me to find the right words and structure for readers, and d) there are actually only few complaints.

So far, so bad. Now with a more positive attitude:

I could find many more points, I guess those I provided are sufficient to get you an idea of my current thinking.

To sum up, I do not know how to proceed in a way that will lead to success, and that on multiple levels. To be clear, I am pretty unhappy with the state of the current doc (including the ads). But I don't see a path to success. I am all ears for concrete ideas on how to improve the doc, but these need to take into account:

  • time required
  • what really needs to be addressed
  • media format

I am also willing to give a couple of specific topics an "experimental blog try", if someone can point out good topics and why they are good topics.

davidelang commented 12 months ago

Just a note that one problem with blog posts is that they lack edits/corrections/updates. yes you can add comments, but that's limited.

But I really do value the freedom of just free-form text and images that can be written up without the weight of integrating with everything else. We need some way of doing that and having a path to then integrate into the more formal text.

David Lang

On Fri, 3 Nov 2023, Rainer Gerhards wrote:

Thx for the good discussion.

I am not happy with the doc in either case. Actually, accessing it has gotten worse when we tried to improve it the last time. From my perspective it's side-work that I am not so much motivated to work on, as a) few people read it at all, b) it costs massive amounts of time if done correctly, c) it's very hard for me to find the right words and structure for readers, and d) there are actually only few complaints.

So far, so bad. Now with a more positive attitude:

  • I myself do not like the ads. They provide ~ $3,000 per year and we have more than once discussed ditching them. But we obviously came to no final conclusion, partly because of the negative things listed above.
  • As a side-note, readthedocs is very annoying. Someone uploaded a version ages ago and then disappeared. readthedocs does not take that version down or assign ownership to us. We tried to make that happen multiple times. I've given up on that.
  • I have actually been thinking since summer about complementing the docs with a series of blog postings. But believe it or not, I have no clear idea what people actually need. I would even create howto videos, but the focus is pretty unclear.
  • Good doc needs images/visualizations. We have no one good at this.
  • We had help in the past. I really appreciate these efforts. But they were always short-lived. While this still is super-useful, it's not something that keeps things consistently in good shape. Remember: rsyslog is a real open source project, where volunteer work is really needed.
  • I am still not 100% convinced that sphinx/rst is the way to go. Reason? Personal! I don't know it well enough, and doing anything non-trivial (sometimes even trivial) takes quite long for me. As I seldom work on the doc, I tend to forget things before I touch it the next time. This is the main reason why I am thinking about blog postings (where I am very fluent with the system).
  • Blog postings also have the advantage of avoiding situations like readthedocs again (this is one aspect where open source obviously causes confusion).
  • We tried various ways to document rsyslog in the past 20 yrs (man pages, blogs, website, pure html, now sphinx) - none of them really worked well. The root problem was probably beginner-friendliness and structure. I admit I am pretty demotivated to invest further time.
  • ...

I could find many more points, I guess those I provided are sufficient to get you an idea of my current thinking.

To sum up, I do not know how to proceed in a way that will lead to success, and that on multiple levels. To be clear, I am pretty unhappy with the state of the current doc (including the ads). But I don't see a path to success. I am all ears for concrete ideas on how to improve the doc, but these need to take into account:

  • time required
  • what really needs to be addressed
  • media format

I am also willing to give a couple of specific topics an "experimental blog try", if someone can point out good topics and why they are good topics.

davidelang commented 12 months ago

As I'm thinking about it more, I see a need to have a couple different types of documentation, and we've generally focused on just one.

  1. (what we mostly have now), reference documentation of valid syntax
  2. howto documents (examples, case studies, what not to do), you have been doing some of this as blog posts.
  3. explanations (history, architecture, etc)

I am comfortable doing writing, but am not so good at making it pretty and formatting (I do markdown-type things that are barely more than plain text). I haven't looked into sphinx; sounds like I need to. I do a minimal amount of diagramming (usually using blockdiag).

I think that I will be able to replace the ad money, but I need to wait a few weeks before I'm ready to do that (and it would probably be better if there was some company who could get value from 'sponsored by' rather than just me)

David Lang


grinapo commented 11 months ago

My original problem (when I was considering how to suggest rewriting the docs) is that I realised that nobody knows how to "officially" configure rsyslog. Everywhere there is a mix of old syslog syntax and RainerScript syntax, and people strongly suggest not using this or not using that (in fact, summing them up results in not using rsyslog at all, since all of the syntax is obsolete :grin: ). It is a PITA that there are two parallel syntaxes for everything and nobody can actually say "this one is obsolete/useless, avoid that, do not even document that". I don't think it's possible to create sane documentation until the developers actually decide how to use the program, pick one syntax, say that it's complete and working and performant, and that we shall use it, end of story.

After that it's possible to create a consistent documentation of that syntax, and possibly create a different and brief documentation of the obsolete syntax (without mixing it up with the current one!).

Apart from that, @davidelang is perfectly correct that there needs to be howto/tutorial-structured documentation with rich and real-life examples, which shall also point to the matching parts of the reference documentation (which is also supposed to show examples), which clearly defines everything. And yes, using only the current syntax (whichever it may be), and not anything else.

davidelang commented 11 months ago

On Tue, 7 Nov 2023, Peter Gervai wrote:

My original problem (when I was considering how to suggest rewriting the docs) is that I realised that nobody knows how to "officially" configure rsyslog. Everywhere there is a mix of old syslog syntax and RainerScript syntax, and people strongly suggest not using this or not using that (in fact, summing them up results in not using rsyslog at all, since all of the syntax is obsolete :grin: ). It is a PITA that there are two parallel syntaxes for everything and nobody can actually say "this one is obsolete/useless, avoid that, do not even document that".

If we don't document even the stuff that we don't want people to use, then new people who inherit old configs have no idea what they are doing and so can't convert them to a new syntax. So we absolutely do need to document anything that works.

I do think in the last few years of documentation updates, we have lost some of what was there at one point differentiating the old-acceptable, old-don't-use, and new-acceptable versions of things. I remember doc pages that had both new and old ways, along with examples of both. That has vanished at some point.

I don't think it's possible to create sane documentation until the developers actually decide how to use the program, pick one syntax, say that it's complete and working and performant, and that we shall use it, end of story.

The problem I see with this is that there are decades of documentation of how to use syslog, and for doing the simple stuff, that old syntax is simple and straightforward.

Syslog-ng took the path of redefining things from scratch, and it makes it possible to do very complex things, but doing even simple things is non-trivial.

When it came time to replace the legacy syslog daemon in Linux distros, there were a lot of people who pushed for syslog-ng; it had a longer history and a better reputation for 'enterprise use'. But I think one of the big attractions that ended up driving the decision to use rsyslog instead was the backwards compatibility.

Rsyslog is a bit easier to do complex things in than syslog-ng (IMHO), but we started down the road of pushing everyone to convert to the new syntax, and it did not go particularly well.

After that it's possible to create a consistent documentation of that syntax, and possibly create a different and brief documentation of the obsolete syntax (without mixing it up with the current one!).

The more decoupled they get, the more likely they are to end up out of sync or contradicting each other.

Apart from that, @davidelang is perfectly correct that there needs to be howto/tutorial-structured documentation with rich and real-life examples, which shall also point to the matching parts of the reference documentation (which is also supposed to show examples), which clearly defines everything. And yes, using only the current syntax (whichever it may be), and not anything else.

Using just the new syntax for the examples will work the vast majority of the time. But simple facility/severity filters writing to files are still far clearer in the legacy format than in anything else.

David Lang

rgerhards commented 11 months ago

While there is a lot one can complain about in the rsyslog doc, I do not think a missing distinction between formats is part of it. I had specifically written up doc on which formats exist and which ones are best to use (and when):

https://www.rsyslog.com/doc/v8-stable/configuration/conf_formats.html

The term "obsolete legacy format" should also, by itself, make pretty clear what to stay away from - at least so I thought (this was also the Mailing List consensus at the time of writing).

Also, each doc page talking about config objects and statements gives the advanced (aka "RainerScript") format construct and has a table listing the obsolete legacy counterpart (if any). This was meant to help look up old constructs and understand older configs.

Not bashing anyone, but I was under the firm impression that this should make crystal-clear what to use - and what to avoid. If that assumption is wrong, this proves more than anything that I am probably the wrong person to contribute doc.

But maybe there are other reasons why which format to use is so hard to understand. @georgehank did you even look at the documents in question or grasp what the "obsolete legacy" column is talking about? Honest question!

grinapo commented 11 months ago

Not bashing anyone, but I was under the firm impression that this should make crystal-clear what to use - and what to avoid.

Unfortunately not; I believe the main reason is that you are talking about "some" pages (which users may or may not even discover) while I am talking about the documentation as a whole. If someone (with no preliminary knowledge about rsyslog) comes to the doc and starts reading, it's extremely confusing, since - as I mentioned - it is constantly mixing up the formats, sometimes even within the same paragraph. Funny thing is that I rarely reconfigure rsyslog, and every time I have to look it up in the docs, and every time I am astonished by the confusion it causes and the triple amount of time required to decipher which solution I should use for a specific problem. (syslog-ng documentation is bad in a different way, since it is mostly well written and consistent but its structure is a mess; still, it is much easier to use and to find the solutions there.)

I both agree and disagree with @davidelang : there may be a need for documentation of the "old" format, but I strongly believe it shall not be mixed with the new, since that is extremely confusing; they are very different syntaxes with very different logic. If someone feels the need to document and compare them, this shall be made into a very different doc, never to be touched by non-developers or non-rsyslog-hackers.

Or, maybe, if it is stuck with both syntaxes (the reason for which I will never understand, since it is bound to be confusing and messy), then they shall be separated into very separate sections in the docs, probably "first half" and "second half", with only links to the other half and never quoting anything from it.

That is my input and opinion and view, from a "newbie" point of view (though I am certainly not that in multiple ways).

rgerhards commented 11 months ago

If someone (with no preliminary knowledge about rsyslog) comes to the doc and starts reading, it's extremely confusing, since - as I mentioned - it is constantly mixing up the formats,

Can you post some links to those pages? Just so that I have some concrete examples.

rgerhards commented 11 months ago

If someone feels the need to document and compare them, this shall be made into a very different doc, never to be touched by non-developers or non-rsyslog-hackers.

I don't think folks who have old configs to maintain would agree. They would not know how to tackle them. Plus the basic format is good to go for simple use cases AND distro default configs contain it.

computerquip-work commented 11 months ago

Can you post some links to those pages? Just so that I have some concrete examples.

The obvious one is the conversion page: https://www.rsyslog.com/doc/master/configuration/converting_to_new_format.html It encourages the use of basic syntax for simple use cases.

There's also the actions page which says: https://www.rsyslog.com/doc/master/configuration/actions.html

Be warned that legacy action format is hard to get right. It is recommended to use RainerScript-Style action format whenever possible!

But it doesn't say anything about basic syntax, and then uses basic syntax in the examples.

Your own blog post suggests starting with basic syntax and only using RainerScript for advanced cases: https://rainer.gerhards.net/2013/02/should-i-use-rsyslogs-new-or-old-config-style.html

It seems basically inevitable that every configuration file ends up with a mishmash of several different configuration formats and it still doesn't seem clear to me where the distinction sometimes starts and ends.

I don't think folks who have old configs to maintain would agree. They would not know how to tackle them. Plus the basic format is good to go for simple use cases AND distro default configs contain it.

A lot of the default configuration is based off the sample.conf within the repository which doesn't use a lick of RainerScript.

People have been complaining about the mixed format for the better part of a decade. The reason people aren't vocal about it is generally that they do literally one thing with it after spending 3 hours figuring out the configuration, and then never touch it again. Or they just use something else after getting frustrated with it and not getting a response on ServerFault.

rgerhards commented 11 months ago

@computerquip-work It's interesting insight from the novice PoV (which I obviously totally miss).

Nevertheless, some notes:

As I said, there was a lot of valuable feedback in the answer, and I appreciate any more.

But be warned that we cannot rule the Internet as a whole, so what a Google search returns is not the rsyslog doc. Just as a gentle reminder. Rsyslog has been in use, under that name, for more than two decades, and a lot of then-right advice has been published in that time frame, most of it still valid, but probably outdated in style. That in itself is a reason why IMHO we need to maintain backward compatibility: there is a wealth of useful information on the web, and we do not want to invalidate it by limiting config options. But, and this is important: we can for sure do much better in advising what to use in modern deployments.

grinapo commented 11 months ago

Which Format should I Use? […] While it is an older format, the basic format is still suggested for configurations that mostly consist of simple statements. […] For anything more advanced, use the advanced format.

That is what the first time user gets. Now, the user starts thinking: why is there the "basic" format, which is unreadable, badly structured and visibly multiple decades old? They may - possibly correctly - think that it is documented because:

still being taught in courses and a lot of people know this syntax. It is perfectly fine to use these constructs even in newly written config files. Note that many distributions use this format in their default rsyslog.conf, so you will likely find it in existing configurations.

So it is basically documented because this format is still around and people need to understand it, but "clearly" the document "suggests" to use the "new" format.

So, the imaginary first time reader (correctly??) decided that they will learn the advanced format, since sanity dictates that one shall not start learning obsolete formats for current use. Also old format is unreadable. Borderline illogical in many places. Lot of hidden syntax. Lot of obscure one-character functions. So, they start reading the documentation with that approach, which is "use only the modern format and do not touch anything obsolete", it is clean, it is sane.

Now, enter almost any page and check whether the examples are in clean advanced format, and the page contains everything about usage, variables, templates, whatever in advanced format, and whether it is easily possible to skip parts which contain any bits of obsolete formats (legacy and basic). "Easy" in my case means that I do not have to read it, it is separated in distinct section or a separate page.

No. This is the next page to the above:

Do not overdo conversion […]We suggest you leave it as-is without conversion. Equally, in our opinion it is also fine to add new rules like the above. If you still want to convert, the line may look as follows (completely in new format): […]

And there are bits like this:

Note: in obsolete legacy format it is possible to provide global parameters more than once. In this case it is unclear which one actually applies. For example: […] This is especially problematic if module-global parameters are used multiple times in include files. In advanced format this is no longer possible.

So, um, I cannot even use advanced format for some things? Wait:

NOTE: Some actions do not have a basic format configuration line. They may only be called via the action() syntax. Similarly, some very few actions, mostly contributed, do not support action() syntax and thus can only be configured via basic and obsolete legacy.

So "you" suggest to mix wildly different syntax, because they are all "broken" ("advanced" one cannot do things the basic can so you are forced to mix them no matter what)? That seems to be the logical conclusion reading just the first few pages of the documentation.

The reader here watches their brain melting, facepalms, and possibly needs a long walk in the fresh air. From then on the documentation "consistently" makes a mess of oscillating between configuration syntaxes.

As far as I have checked, almost any random page is a good example. I have clicked on 10 random titles and found none which contained coherent advanced format examples not mixed with basic or legacy or syslog or else.

I (and possibly a large number of users) do not want to use multiple syntaxes in config files. I think I am perfectly capable of typing "filter" instead of "@", and typing braces and semicolons is pretty easy for me too. The documentation does not support me. And yes, this has been going on for possibly decades: people complaining, and resorting to bits and pieces outside the "official" documentation which do not mix up different syntaxes.

grinapo commented 11 months ago

blog post: that one is from 2013 - I cannot update anything written in the past 30 years.

People usually add a first line "this article contains historical facts which no longer apply".

Usually this is not needed, since people can surely read the current documentation and get their questions answered there, up to date, state of the art. One may start thinking: why do people have to read old blog posts and stackexchange to get their pretty common questions answered? Why do they have questions about "which syntax" to use at all? Why do they all seem very confused, for decades now?

I am not sure there is any project I have seen in the last, um, 3-4 decades which uses multiple syntaxes mixed in the config files, apart from rsyslog. Lot of projects updated their config format but they clearly picked one and obsoleted and replaced the rest.

rgerhards commented 11 months ago

That is what the first time user gets. Now, the user starts thinking: why is there the "basic" format, which is unreadable, badly structured and visibly multiple decades old?

No! This is exactly not the case. Basic format is not unreadable (YMMV) or badly structured. It's by far the easiest way to express things.

I'd say that

*.info /var/log/info.log

beats

if prifilt('*.info') then
    action(type="omfile" file="/var/log/info.log")

in simplicity in the many cases where only such constructs are needed.

rgerhards commented 11 months ago

Lot of projects updated their config format but they clearly picked one and obsoleted and replaced the rest.

To me, as a datacenter guy, this is today's IT horror cabinet. Every new version of a project may break your working-since-years config, just because someone again changed config parameters, formats, default and statements.

It may be currently hyper-modern to do so, but IMHO it is also at the root of why everybody is so concerned with properly updating their system. An update may at any moment break everything purely through side-effects. That's not my view of enterprise computing (and I honestly think this stance is one of the root causes for security incidents).

NEVERTHELESS, I find the input very valuable, especially as it provides totally different views and opinions than my own. The question now is if we find a team of volunteers to create the new type of doc you are suggesting. Even if not, we can slowly and gradually work on improving the existing set, but this takes time and it is of course problematic if we do not have proponents of the "ditch the doc for old format" approach on board. As a side-note, we really cannot totally ditch it, as there are far too many references in printed books and on the internet that use it and it is important to tell people how to understand this (in case they search for some solution and find older formats in search).

rgerhards commented 11 months ago

Just a side-note: I thought and sat down to write some kind of "rsyslog for beginners" piece of doc, mentioning only the advanced (new-style) rsyslog config. But as soon as I did, I noticed that would leave especially beginners totally confused. What they see when they install rsyslog distro packages is sysklogd format. So leaving that out of a beginner's guide would leave them totally lost.

All in all, I conclude more than a bit frustrated: there seems to be no right way to improve the doc. In my current mood, I tend to say it's a waste of time, as somebody always prefers a different structure. @davidelang what do you think?

davidelang commented 11 months ago

I see clear ways to improve the docs, but I agree that there is a lot of disagreement. I think we need some writeups similar to what I did early in this discussion and then drill down from there.

I think we need a lot more examples than I'm seeing now and more, not less documentation of the old formats.

I think that most of what we have is reference material, outlining all the options. This is needed, but we also need 'how to get started' and 'why do things a particular way' type of docs as well, with such docs then linking down into the reference material docs for details.

I also think we need to mine your blog posts and import many of them into the docs where they can get updated as appropriate.

work is a bit crazy right now, but I do plan to work on this in the near future.

David Lang


grinapo commented 11 months ago

I tried to avoid reacting to Rainer's comments since we clearly have wildly opposite views on the readability, usability and purpose of basic and advanced formats.

But I believe it is an important prerequisite of writing the documentation to make it clear: what is the official format? The basic, the advanced or the mishmash of both? If the last, then I do not think there is a point in putting a lot of energy into a new documentation which is bound to be confusing for that large group of people seeing the world differently than Rainer. :-)

(I am now even pretty confused about why rainerscript exists at all if it is not superior to the old format. I thought it's there to make an - in my strong opinion - unreadable and obscure syntax more human eyes friendly, and also more flexible and clear, and seemed like a great idea, but I am not sure anymore. Still, I cannot repeat enough that two syntaxen together are extremely confusing.)

computerquip-work commented 11 months ago

I don't think that removing a largely in-use configuration format is a good idea. As a matter of fact, I believe documenting the basic syntax is important (although I think that in itself could be improved). I will say that mixing the two is confusing and that it's a hard problem to tackle.

computerquip-work commented 11 months ago

Oh, this is closed now. Okay.

rgerhards commented 11 months ago

kind of mistake - I was about to comment but then thought twice about it, deleted it and frustration finally took over. Sorry for that.

Feel free to continue to discuss here, but I am out at least for the time being. Doesn't help to learn that this is actually unsolvable. Especially if real-world hard facts are simply ignored. I keep rsyslog useful in the datacenter and breaking changes are definitely a thing most real data center admins (I mean where it really matters) do not want to have.

davidelang commented 11 months ago

for simple things, the old format is better. But it got to the point where you had

$foo a $bar b for dosomething2 dosomething $baz c dosomething2

where it's very hard to track what settings apply to what

so instead rainerscript is

dosomething(foo=a, baz=c) dosomething2(bar=b)

so if you are doing simple stuff (that can be represented in a single line) the legacy format is probably better

if you are doing things that require multiple lines, rainerscript is better (with a few exceptions)

David Lang


computerquip-work commented 11 months ago

I'm not advocating for breaking changes.

I do think mixing the formats is confusing because it causes inconsistencies. The example you gave where the advanced format is more verbose is sort of cheating, I feel, since you could just do *.info action(type="omfile" file="/var/log/info.log"). Regardless though, the moment you start introducing symbols such as *.* @@192.0.2.1:10514, things get really confusing really fast. For example, @@ is technically "basic" syntax. @ is technically "legacy" since it existed in sysklogd? Is *.* action(...) mixing legacy with advanced? In addition, all are discouraged in some guides and encouraged in others. The documentation states that legacy is obsolete here so for what is now the third or fourth time I've stated this, it is not clear to me the distinction between the three formats.

Despite the above, I don't necessarily think that rsyslog should "choose a side", I think it should just better document the distinctions between the three different formats. Actually have a list of legacy that people shouldn't use. Have an explanation on what is basic and what is legacy.

It is a little bit frustrating that some concrete examples of where the documentation could be improved end up with responses claiming it's unsolvable. I really don't think it takes someone new (nor am I really that new at this point) to at least understand the frustration points. I'm not talking about just the initial learning curve, I'm talking about it as a reference and guideline in general. If I want to look up some current syntax in the documentation as a reference, it's not that easy as I've pointed out.

Anyways, I've said my piece, I'll not comment further.

davidelang commented 11 months ago

On Wed, 8 Nov 2023, computerquip-work wrote:

I'm not advocating for breaking changes.

It sounded like you were arguing that we should abandon the legacy config stuff entirely

I do think mixing the formats is confusing because it causes inconsistencies. The example you gave where the advanced format is more verbose is sort of cheating, I feel, since you could just do *.info action(type="omfile" file="/var/log/info.log"). Regardless though, the moment you start introducing symbols such as *.* @@192.0.2.1:10514, things get really confusing really fast. For example, @@ is technically "basic" syntax. @ is technically "legacy" since it existed in sysklogd? In addition, both are discouraged in some guides and encouraged in others. The documentation states that legacy is obsolete here so for what is now the third or fourth time I've stated this, it is not clear to me the distinction between the three formats.

@foo
@@bar
/var/log/baz
?pathtemplate;linetemplate

all are legacy or sysklogd and still sometimes the best way to do things

action() is the new syntax
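
For readers who have only seen one of the two styles, here is a rough side-by-side sketch of how the four legacy action forms listed above might be spelled with action(). The template definitions are hypothetical stand-ins for "pathtemplate" and "linetemplate" (their contents are assumptions made only so the fragment is complete), and foo and bar are the placeholder host names from the list above.

```
# Hypothetical templates, defined here only so the fragment is self-contained
template(name="pathtemplate" type="string" string="/var/log/hosts/%HOSTNAME%.log")
template(name="linetemplate" type="string" string="%TIMESTAMP% %HOSTNAME% %syslogtag%%msg%\n")

# @foo  (forward to host foo via UDP)
action(type="omfwd" target="foo" protocol="udp")

# @@bar  (forward to host bar via TCP)
action(type="omfwd" target="bar" protocol="tcp")

# /var/log/baz  (write to a fixed file)
action(type="omfile" file="/var/log/baz")

# ?pathtemplate;linetemplate  (file name from one template, line format from another)
action(type="omfile" dynaFile="pathtemplate" template="linetemplate")
```

In both styles the action is preceded by whatever filter selects the messages; without a filter, an action() statement applies to every message that reaches it.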

but in between these, there are a lot of options that you could set to create queues, set encryption, and other advanced things that required you to do a lot of $setting foo $setting2 bar ahead of the legacy line.

It's this combination of the legacy action lines and all the other settings ahead of it that we discourage (and has gotten lost from at least some of the docs)

There are a small number of cases where this multi-line setup is still useful

for example, the rsyslog.conf on my ubuntu system has

$FileOwner syslog
$FileGroup adm
$FileCreateMode 0640
$DirCreateMode 0755
$Umask 0022

and then later (in an included file) has the traditional

auth,authpriv.* /var/log/auth.log
*.*;auth,authpriv.none -/var/log/syslog
kern.* -/var/log/kern.log
mail.* -/var/log/mail.log
mail.err /var/log/mail.err
*.emerg :omusrmsg:*

switching all of these lines to something along the lines of:

action(type="omfile" fileowner="syslog" filegroup="adm" filecreatemode="0640" dircreatemode="0755" umask="0022" file="/var/log/auth.log")

does not make it easier to see what's happening, and it can be argued that setting the same thing many times is an invitation to typos.
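
For comparison only, a possible middle ground is sketched below. It assumes that the omfile module accepts file defaults as module-level parameters and that the umask can be set through global(). Both are believed to be available in current v8 releases, but should be checked against the installed version. The defaults are stated once and the per-file lines stay short.

```
# Set the file defaults once (stands in for the $FileOwner ... $DirCreateMode lines)
module(load="builtin:omfile"
       fileOwner="syslog"
       fileGroup="adm"
       fileCreateMode="0640"
       dirCreateMode="0755")

# Process-wide umask (stands in for $Umask)
global(umask="0022")

# Per-file rules no longer repeat the ownership settings
auth,authpriv.*    action(type="omfile" file="/var/log/auth.log")
kern.*             action(type="omfile" file="/var/log/kern.log")
```

Whether this reads better than the legacy directives is a matter of taste; it mainly avoids repeating the same five parameters in every action.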

(ignoring the filters, even the prifilt() function that Rainer used in his last example is a shortcut; to be 'proper' per language purists, it would be something like 'if $syslogseverity <= 6 then' instead of using prifilt)
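
To make the shortcut concrete, here is a small sketch of the two spellings. It assumes the standard syslog numbering, where info is severity 6 and lower numbers are more severe, so a selector such as *.info corresponds to a numeric severity of 6 or below. The file name is taken from Rainer's earlier example; only one of the two blocks would be used in a real config.

```
# Shortcut: reuse the traditional selector syntax inside an expression
if prifilt("*.info") then {
    action(type="omfile" file="/var/log/info.log")
}

# Spelled out: compare the numeric severity directly
if $syslogseverity <= 6 then {
    action(type="omfile" file="/var/log/info.log")
}
```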

But, some of the '$setsomething to be used later' weren't settings that stay consistent until redefined, some (like the queue configs) only apply to the next action

looking at this redhat page: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/deployment_guide/s1-working_with_queues_in_rsyslog

you have

$ActionQueueType LinkedList
$ActionQueueFileName example_fwd
$ActionResumeRetryCount -1
$ActionQueueSaveOnShutdown on
*.* @@example.com:6514

(this is a case where @@ isn't the problem, it's the combination of @@ and the other things)

this is an example of many lines to define one action, where these definitions only apply to the next action, so if you change the config to:

$ActionQueueType LinkedList
$ActionQueueFileName example_fwd
$ActionResumeRetryCount -1
$ActionQueueSaveOnShutdown on
/var/log/testmessage
*.* @@example.com:6514

the queue now applies to the write to testmessage instead of the tcp relay

or

$ActionQueueType LinkedList
$ActionQueueFileName example_fwd
/var/log/testmessage
$ActionResumeRetryCount -1
$ActionQueueSaveOnShutdown on
*.* @@example.com:6514

where now you don't have a complete queue definition for either action

changing this to action() format makes it clear what the queue applies to
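
A sketch of how the Red Hat example might look in action() format follows. The parameter names are the advanced-format counterparts of the $Action... directives as commonly documented; the host, port and queue file name are simply carried over from the example above.

```
# All queue settings live inside the one action they belong to
action(type="omfwd"
       target="example.com"
       port="6514"
       protocol="tcp"
       queue.type="LinkedList"
       queue.filename="example_fwd"
       queue.saveOnShutdown="on"
       action.resumeRetryCount="-1")
```

Adding another action before or after this one cannot silently take over the queue definition, which is the failure mode shown in the two rearranged legacy examples.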

Despite the above, I don't necessarily think that rsyslog should "choose a side",

Apologies if I am mixing your posts up with someone else's, but someone has been saying that we need to decide which syntax is proper and do everything we can to eliminate the other. Very much 'picking a side'. In the past we've had people advocate that we should remove support for the old syntax, which would break existing configs. Some of the documentation in the past that talked about the different syntax options was written at a time when that was being considered.

I think it should just better document the distinctions between the three different formats. Actually have a list of legacy that people shouldn't use. Have an explanation on what is basic and what is legacy.

I thought we did. Unfortunately, it looks like some prior doc cleanup has removed a lot of the documentation about the legacy options and the examples of what not to do. This was someone who believed as you do that the docs should only show the 'proper' way of doing things

It is a little bit frustrating that some concrete examples of where the documentation could be improved end up with responses claiming it's unsolvable. I really don't think it takes someone new (nor am I really that new at this point) to at least understand the frustration points. I'm not talking about just the initial learning curve, I'm talking about it as a reference and guideline in general.

There are some conflicting requirements here (document all syntax vs document only preferred syntax), and it's really hard for people who know this stuff inside and out to see all the issues (legacy vs sysklogd being a perfect example)

Anyways, I've said my piece, I'll not comment further.

I don't think that there is anyone who thinks the rsyslog documentation is stellar, but trying to fix it seems to be dancing in a minefield.

I think that it has shown some things that we can do better, but getting the time to do so is harder than it should be.

David Lang

rgerhards commented 11 months ago

I thought we did. Unfortunately, it looks like some prior doc cleanup has removed a lot of the documentation about the legacy options and the examples of what not to do.

The format descriptions are IMHO fully descriptive inside the doc: https://www.rsyslog.com/doc/v8-stable/configuration/conf_formats.html

We converted sample code from obsolete legacy to advanced style. But all legacy statements are still documented, on the page that describes advanced format. One may argue that makes them too hard to find, but if the same one argues that the descriptions should go away completely, it makes the argument weak.

I think that it has shown some things that we can do better, but getting the time to do so is harder than it should be.

It's not just the time. My frustration is so deep because I have often listened to advice like that dispensed in this thread. Even when it is sound (some in this thread is not), and we changed it, yet another thread was started telling why exactly how we do it is bad. Here, the advice is partly contradictory in itself. But I know from 20 yrs of discussion that we will see an immense wave of bad mood when we remove doc for old statements altogether - much more so if we would actually remove the config language itself. If that argument is ignored as "Rainer's crazy stance" and not even considered, to me it is a waste of time to further discuss with those folks.

Bottom line:

Well, and now I have wasted another set of time in which I could probably have been more productive. Seems like the topic still drives me crazy ;-)

grinapo commented 11 months ago

some in this thread is not

Feels like someone's talking about me, and also feels like I am very much misunderstood, mainly because I try not to write small essays in an issue ticket.

So let me try to summarize what I meant, and trying to stay brief.

  1. If the Powers That Be would decide on one format, then the mainline documentation shall consistently use that format, both in description and in the examples.
  2. The other format and the legacy format shall have documentation, since they are still valid and parseable by rsyslog. However these formats already have documentation, right now!
  3. So what I meant is that there ought to be a new documentation, which would use one syntax and possibly restructured to guide, reference and glossary sections, and
  4. the old documentation would stay online, and it could be referenced, linked to by the new one.
  5. This way everything is documented, since the "current" syntax is in the new documentation, and the old syntaxen are in the current ("old") documentation. If it is good today, surely it will be good tomorrow. ;-)

Also I do not know you well, Rainer, but in my experience (and it goes for 30+ years now) good programmers are "not always" the same persons who write good documentation, mainly due to the fact that they are very familiar with the code structure and their documentation follows that logic. Also there are a lot of "obvious" things they skip to explain while newbies have no knowledge about these obvious things at all. In my experience non-programmer (and non rsyslog-internals master) users need a differently structured documentation and it may be possible that you have a very concrete image of how you would document it, while (as you say) 20+ years went by with users complaining about it, so maybe they really need some different kind of documentation structure. I would say it could be beneficial to look around for documentation people generally think being excellent, or look around for people who have experience writing (good) documentation.

I possibly have completely misphrased what I meant, which is not "crazy Rainer" but that your personal approach may be unfitting for what the people would expect or could easily use. It happened many times that "documentation people" designed a framework for me or the projects I was involved and my job was to fill the slots with knowledge. (Also writing documentation is tedious and time consuming, and maybe it's better if you can do the coding while others sweat on creating docs and debate with annoying users. Even in this issue there are more people offering help, and I guess there would be even more around the 20+ years old rsyslog community.)

Also, I am not sure that this issue is the best way to discuss the attributes of a good documentation. I can offer interfaces like https://hedge.grin.hu/ to create collaborative design, guidelines, mockups, whatever, but unfortunately it would not work until it's clear which one syntax shall the "new" documentation describe (which should, I emphasize, link to the old documentation containing description of the old syntax). (Also this was the main reason that I haven't started creating a sample documentation since even I cannot know how configuration is supposed to be used canonically; this is the most important step for creating a documentation of something.)

Apologies for the verbose comment.

grinapo commented 11 months ago

In a different comment I just have to share that my *syslog configs are often non-obvious, that is one very specific reason that I prefer structured configuration syntax instead of "obscure": character based ones. I often collect logs from various sources, send them encrypted to collector servers using whatever protocol, often filtering and parsing them, or cloning/splitting in due course. This is not the "distribution defaults" where a priority goes into a specific file and that's all. This may be one reason that my comments are… like they are, since I know the problems I already had with the mixed syntax.

rgerhards commented 11 months ago

Also I do not know you well, Rainer, but in my experience (and it goes for 30+ years now) good programmers are "not always" the same persons who write good documentation, mainly due to the fact that they are very familiar with the code structure and their documentation follows that logic. Also there are a lot of "obvious" things they skip to explain while newbies have no knowledge about these obvious things at all. In my experience non-programmer (and non rsyslog-internals master) users need a differently structured documentation and it may be possible that you have a very concrete image of how you would document it, while (as you say) 20+ years went by with users complaining about it, so maybe they really need some different kind of documentation structure. I would say it could be beneficial to look around for documentation people generally think being excellent, or look around for people who have experience writing (good) documentation.

I possibly have completely misphrased what I meant, which is not "crazy Rainer" but that your personal approach may be unfitting for what the people would expect or could easily use. It happened many times that "documentation people" designed a framework for me or the projects I was involved and my job was to fill the slots with knowledge. (Also writing documentation is tedious and time consuming, and maybe it's better if you can do the coding while others sweat on creating docs and debate with annoying users. Even in this issue there are more people offering help, and I guess there would be even more around the 20+ years old rsyslog community.)

I couldn't agree more on that. The problem is, as I wrote, that only very seldom folks show up (@deoren was a notable exception and helped a lot getting things IMHO to the better side). It is even more of a problem to make those very few people stick.

I do not object at all to an approach where good doc writers gather together (on whatever platform) and do great things. I can offer my best advice and would be extremely happy if that would materialize. Out of past experience, the end result would probably need to be at least maintainable by me, in case the effort vanishes at some point.

For the very same reasons you give, I have always said that I am the worst person to write the doc, yet I am the only one who consistently is available to do it.

Thus, from my PoV, I consider this situation as "unsolvable", just to explain that term. I am deeply frustrated about the state of doc and I do not know any decent way how I could make it better. Tried so many approaches in the past decades.

And, just as reference, I am 40+ yrs in IT, if you like a quick overview, head here: https://rainer.gerhards.net/biography-kind-of (a bit dated as well, but still correct).

rgerhards commented 11 months ago

but unfortunately it would not work until it's clear which one syntax shall the "new" documentation describe

I thought this is clear "While rsyslog supports all three formats concurrently, you are strongly encouraged to avoid using the obsolete legacy format. Instead, you should use the basic format for basic configurations and the advanced format for anything else." (I quoted the link before and you commented on it).

As you say, that may just be clear in my mind. Let's try to re-phrase:

"Advanced format is the way to go".

I personally still think we need to prominently tell folks that other formats are out there, and they will be exposed to them as soon as they open their distros /etc/rsyslog.conf or do a random google search on rsyslog or syslog. There is just so much legacy.

It's a moot point if basic format is a good fit as well (IMHO yes, but YMMV).

rgerhards commented 11 months ago

In a different comment I just have to share that my *syslog configs are often non-obvious, that is one very specific reason that I prefer structured configuration syntax instead of "obscure": character based ones.

That's my point: you have a bias as well.

Small biz folks or hobbyists who run rsyslog to see what their router tells them may want to add just one line for a new file. Given the fact that distros and tutorials use basic (sysklogd, legacy) format, I personally would assume it is a far gentler learning curve to tell them how to add just another, similar, line.

There are obviously very different needs. But, right: we do not address any of these needs adequately right now. At least this is what I get from discussions.