389ds / 389-ds-base

The enterprise-class Open Source LDAP server for Linux
https://www.port389.org/
Other
211 stars 91 forks source link

Add new logging severity level to the errors log #2037

Closed 389-ds-bot closed 4 years ago

389-ds-bot commented 4 years ago

Cloned from Pagure issue: https://pagure.io/389-ds-base/issue/48978


There is a need to improve/clarify our error logging. It can be hard to tell when a message is an error, or something benign. Adding a new severity parameter would resolve this. The new logging type should follow https://tools.ietf.org/html/rfc3164

We don't necessarily have to use all these levels, but the levels we use should comply with this list. This can also be used if logging is redirected to journald

           0       Emergency: system is unusable
           1       Alert: action must be taken immediately
           2       Critical: critical conditions
           3       Error: error conditions
           4       Warning: warning conditions
           5       Notice: normal but significant condition
           6       Informational: informational messages
           7       Debug: debug-level messages

Example:

slapi_log_error(SLAPI_LOG_REPL, LOG_ERROR, repl_subsystem, "message\n");

Logged as:

[02/Sep/2016:10:38:11.188881144 -0400] NSMMReplicationPlugin - ERROR - message

We should also add a new logging level config setting that can filter based on the severity. This should work independently of the normal log level: trace function calls, replication, plugin, etc. So even if we don't have replication logging enabled, we will still log that replication message if the severity is set.

389-ds-bot commented 4 years ago

Comment from rmeggins (@richm) at 2016-09-03 03:17:49

The latest version of the syslog rfc is https://tools.ietf.org/html/rfc5424

But in our code we should just use the macros defined in syslog.h as described in man 3 syslog:

   level
       This  determines  the  importance  of  the message.  The levels are, in
       order of decreasing importance:

       LOG_EMERG      system is unusable

       LOG_ALERT      action must be taken immediately

       LOG_CRIT       critical conditions

       LOG_ERR        error conditions

       LOG_WARNING    warning conditions

       LOG_NOTICE     normal, but significant, condition

       LOG_INFO       informational message

       LOG_DEBUG      debug-level message

       The function setlogmask(3) can be used to restrict logging to specified
       levels only.
389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2016-09-06 22:17:39

Refractoring tasks:

SLAPI_LOG_FATAL\LDAP_DEBUG_ANY = 3683 SLAPI_LOG_PLUGIN = 2035 SLAPI_LOG_TRACE = 1437 SLAPI_LOG_REPL = 604 SLAPI_LOG_ACL = 232 SLAPI_LOG_FILTER = 158 SLAPI_LOG_CONNS = 96 SLAPI_LOG_BACKLDBM = 95 SLAPI_LOG_ARGS = 76 SLAPI_LOG_CONFIG = 58 SLAPI_LOG_CACHE = 35 SLAPI_LOG_HOUSE = 17 SLAPI_LOG_SHELL = 6 SLAPI_LOG_BER = 1

I also suggest we keep it to 4 severity levels:

I'm going to do a mass find/replace of all the functions and add the same severity level to each logging type. So all replication logging will be LOG_DEBUG, all FATAL with be LOG_ERR, etc. Once that is committed, then we can go and fine tune each function call in batches.

389-ds-bot commented 4 years ago

Comment from rmeggins (@richm) at 2016-09-06 22:29:09

Replying to [comment:3 mreynolds389]:

Refractoring tasks:

  • LDAPDebug() needs to be replaced with slapi_log_error()

This will remove the ability to compile without those logging statements. This may have performance implications.

  • 330 files will need to be updated
  • 7793 function calls need to be updated

SLAPI_LOG_FATAL\LDAP_DEBUG_ANY = 3683 SLAPI_LOG_PLUGIN = 2035 SLAPI_LOG_TRACE = 1437 SLAPI_LOG_REPL = 604 SLAPI_LOG_ACL = 232 SLAPI_LOG_FILTER = 158 SLAPI_LOG_CONNS = 96 SLAPI_LOG_BACKLDBM = 95 SLAPI_LOG_ARGS = 76 SLAPI_LOG_CONFIG = 58 SLAPI_LOG_CACHE = 35 SLAPI_LOG_HOUSE = 17 SLAPI_LOG_SHELL = 6 SLAPI_LOG_BER = 1

I also suggest we keep it to 4 severity levels:

  • LOG_ERR
  • LOG_WARNING
  • LOG_INFO
  • LOG_DEBUG

Why not use all of the syslog(3) log levels and macros?

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2016-09-06 22:51:27

Replying to [comment:4 richm]:

Replying to [comment:3 mreynolds389]:

Refractoring tasks:

  • LDAPDebug() needs to be replaced with slapi_log_error()

This will remove the ability to compile without those logging statements. This may have performance implications.

Okay, well we can just update it use the new severity level then. No big deal.

I also suggest we keep it to 4 severity levels:

  • LOG_ERR
  • LOG_WARNING
  • LOG_INFO
  • LOG_DEBUG

Why not use all of the syslog(3) log levels and macros?

I was just trying to keep it simple for the end user. Less to filter on I suppose. In syslog there are 4 different types of errors - I was just trying to make it into a single error type. Less confusing for customers, but that's also dumbed down I suppose (not maximizing its potential).

We can use all of them. No problem.

Thanks for the feedback!

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2016-09-08 02:50:02

Part 1 - Add new sev level parameter to logging functions 0001-Ticket-48978-Update-the-logging-function-to-accept-s.patch

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2016-09-08 02:50:22

Part 2 - refactor LDAPDebug 0001-Ticket48978-refactor-LDAPDebug.patch

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2016-09-08 02:51:47

Part 3 - Refector slapi_log_error, and misc issues 0001-Ticket-47978-Refactor-slapi_log_error.patch

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2016-09-08 02:55:15

These first few patches are just setting the default severity levels. Anything with LDAP_DEBUG_ANY/SLAPI_LOG_FATAL gets set to LOG_ERR, otherwise its set to LOG_DEBUG. Part 2 & 3 and very large patches.

Next, add new logging level parameter to filter on severity.

After that, fine tune LOG_ERR's. For example in main.c many error messages are informational.

389-ds-bot commented 4 years ago

Comment from nhosoi (@nhosoi) at 2016-09-08 04:08:32

Your patches look good.

Sorry, to bring it up now... I've found these related bugs/tickets. They look like a dup of your effort. But it has some extra requirements. For instance, how about adding a config option to select printing the severity in a string or a digit for 48260? Bug 1257459/Ticket 48260 - Add classification information to error logs and a way to process them in an automated fashion. Bug 1369592 - RFE - Option to add errorlog level for specific component.

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2016-09-08 06:10:12

Replying to [comment:7 nhosoi]:

Your patches look good.

Sorry, to bring it up now... I've found these related bugs/tickets. They look like a dup of your effort. But it has some extra requirements. For instance, how about adding a config option to select printing the severity in a string or a digit for 48260? Bug 1257459/Ticket 48260 - Add classification information to error logs and a way to process them in an automated fashion. Bug 1369592 - RFE - Option to add errorlog level for specific component.

I'll look into this. I'm far from done with this ticket - I just wanted to commit things in phases.

389-ds-bot commented 4 years ago

Comment from nhosoi (@nhosoi) at 2016-09-08 07:23:54

Great. If that's the case you have my ack on 0001-Ticket-48978-Update-the-logging-function-to-accept-s.patch 0001-Ticket48978-refactor-LDAPDebug.patch​ 0001-Ticket-47978-Refactor-slapi_log_error.patch​

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2016-09-08 07:31:25

11a5b1e..125486d master -> master commit 125486d16b0d8d1a2bef6e98b9f81cd427a839ef Part 3 commit b7631218d5fa118aae223370e1ddda66d93168e2 Part 2 commit 051e0eb50c74d5208355253ff53b301c262c7df2 Part 1

Investigating ticket 48260 next...

389-ds-bot commented 4 years ago

Comment from firstyear (@Firstyear) at 2016-09-08 07:38:46

Will this eliminate the LDAPDebug*() macros? I would be happy to see them all disappear ....

389-ds-bot commented 4 years ago

Comment from firstyear (@Firstyear) at 2016-09-08 07:42:42

My biggest concern is how we will determine what logs to show. If you have:

slapi_log_error(SLAPI_LOG_REPL, LOG_ERROR, repl_subsystem, "message\n");

Do we show this at the normal 8192 level? Or only if repl log in enabled? I think that we could just have some better log levels, and change the default perhaps to show a few of them. Perhaps even this really is exposing the issue that we have log messages being issue to the wrong level in the first place .... The logging code is already a train wreck, I think this makes it more complex ....

389-ds-bot commented 4 years ago

Comment from firstyear (@Firstyear) at 2016-09-08 07:51:54

Actually, reading the first patch, had I of seen this in time, I would NACK this.

This is misleading to the developer. When I put in

slapi_log_error(SLAPI_LOG_REPL, LOG_ERROR, repl_subsystem, "message\n");

I expect that message to be yelled from the rooftops as an error. Something has seriously gone wrong, LOG_ERROR!

But that's not what this patch does. If the log level is not set to recieve LOG_REPL, this will go into the ether! That's not what we want!

The real issue, is log messages being put to the wrong levels, and that we don't have a SLAPI_LOG_INFO, SLAPI_LOG_DEBUG etc. I don't like this approach I'm sorry.

389-ds-bot commented 4 years ago

Comment from lkrispen (@elkris) at 2016-09-08 13:34:28

I agree with William, if we label someting as error it should be always logged, if it is only logged under a specific debug level it should be a warning.

Another area. If I understand it correctly, now the startup ..; Listening ..., shutdown .., Waiting for threads all we be ERROR, but it is no error. Maybe NOTICE would be better:

LOG_NOTICE normal, but significant, condition

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2016-09-08 17:35:15

Replying to [comment:13 Firstyear]:

Actually, reading the first patch, had I of seen this in time, I would NACK this.

This is misleading to the developer. When I put in

slapi_log_error(SLAPI_LOG_REPL, LOG_ERROR, repl_subsystem, "message\n");

Sorry this was just an example - there are currently no occurrences of this pattern.

I expect that message to be yelled from the rooftops as an error. Something has seriously gone wrong, LOG_ERROR!

But that's not what this patch does. If the log level is not set to recieve LOG_REPL, this will go into the ether! That's not what we want!

The real issue, is log messages being put to the wrong levels, and that we don't have a SLAPI_LOG_INFO, SLAPI_LOG_DEBUG etc. I don't like this approach I'm sorry.

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2016-09-08 17:55:17

Replying to [comment:14 elkris]:

I agree with William, if we label someting as error it should be always logged, if it is only logged under a specific debug level it should be a warning.

Another area. If I understand it correctly, now the startup ..; Listening ..., shutdown .., Waiting for threads all we be ERROR, but it is no error. Maybe NOTICE would be better:

LOG_NOTICE normal, but significant, condition

Sorry if I was not clear in my comments in this ticket, but I will reiterate.

These patches were the first stage of factoring the logging. I stated that we NOW need to fine tune the messages. There is much more work needed in this ticket.

After doing the refactoring, I realized that all verbose logging will be DEBUG or something similar(but never ERROR), and we need to re-evaluate all the FATAL messages. For example, I stated in comment 6, we need to fine tune the FATAL messages in main.c, etc. I plan on working on this today.

Everyone, please ignore this example below - this is will never happen.

slapi_log_error(SLAPI_LOG_REPL, LOG_ERROR, repl_subsystem, "message\n");
389-ds-bot commented 4 years ago

Comment from lkrispen (@elkris) at 2016-09-08 20:41:16

ok, understood

389-ds-bot commented 4 years ago

Comment from rmeggins (@richm) at 2016-09-09 03:47:51

You can use __func__ to get the name of the current function. Looks like ipa is using that in more recent code. See http://www.delorie.com/gnu/docs/gcc/cpp_21.html

e.g. instead of

        LDAPDebug(LDAP_DEBUG_ANY, LOG_ERR, "check_oid - Non numeric oid %s passed to check_oid\n",oid,0,0); 
        slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "slapi_be_setentrypoint", 
    580                 "Unknown entry point %d\n", entrypoint); 

use

         LDAPDebug(LDAP_DEBUG_ANY, LOG_ERR, __func__ " - Non numeric oid %s passed to " __func__ "\n",oid,0,0); 
       slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, __func__, ...

however, it says that "... Neither of them (func or FUNCTION) is a macro; the preprocessor does not know the name of the current function" so you might not be able to use string pasting as in the check_oid example, but you can use it as an argument to printf %s.

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2016-09-09 04:09:52

Replying to [comment:19 richm]:

You can use __func__ to get the name of the current function. Looks like ipa is using that in more recent code. See http://www.delorie.com/gnu/docs/gcc/cpp_21.html

e.g. instead of

        LDAPDebug(LDAP_DEBUG_ANY, LOG_ERR, "check_oid - Non numeric oid %s passed to check_oid\n",oid,0,0); 
        slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "slapi_be_setentrypoint", 
  580                 "Unknown entry point %d\n", entrypoint); 

use

         LDAPDebug(LDAP_DEBUG_ANY, LOG_ERR, __func__ " - Non numeric oid %s passed to " __func__ "\n",oid,0,0); 
       slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, __func__, ...

however, it says that "... Neither of them (func or FUNCTION) is a macro; the preprocessor does not know the name of the current function" so you might not be able to use string pasting as in the check_oid example, but you can use it as an argument to printf %s.

That's actually very cool, and maintenance free... However, redoing this patch might consume another day of work. Should I redo it, or just use this technique on the next round of patches?

389-ds-bot commented 4 years ago

Comment from firstyear (@Firstyear) at 2016-09-09 04:45:35

You can use func to get the name of the current function. Looks like ipa is using that in more recent code. See ​http://www.delorie.com/gnu/docs/gcc/cpp_21.html

I don't think this is a good idea to do at this point. func will likely involve some stack magic or something, so I could see it being costly...

Everyone, please ignore this example below - this is will never happen.

slapi_log_error(SLAPI_LOG_REPL, LOG_ERROR, repl_subsystem, "message\n");

In the patch I see

   slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "process_bulk_import_op", "NULL entry\n"); 

So what's the point of the LOG_ERR? We already have the message at FATAL, we are just "doubly" expresing fatal.

The point of this patch apparently is to make it when you say "LOG_ERR" that it puts the word "ERROR" at the start of the line.

Why not just infer that from the SLAPILOG* type? It would save changing the api, and it would have the same effect? It seems like we are double handling here ....

389-ds-bot commented 4 years ago

Comment from rmeggins (@richm) at 2016-09-09 05:13:53

Replying to [comment:21 Firstyear]:

You can use func to get the name of the current function. Looks like ipa is using that in more recent code. See ​http://www.delorie.com/gnu/docs/gcc/cpp_21.html

I don't think this is a good idea to do at this point. func will likely involve some stack magic or something, so I could see it being costly...

Then let's quantify that cost, and weigh that cost vs. readability/maintainability.

Everyone, please ignore this example below - this is will never happen.

slapi_log_error(SLAPI_LOG_REPL, LOG_ERROR, repl_subsystem, "message\n");

In the patch I see

   slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "process_bulk_import_op", "NULL entry\n"); 

So what's the point of the LOG_ERR? We already have the message at FATAL, we are just "doubly" expresing fatal.

The point of this patch apparently is to make it when you say "LOG_ERR" that it puts the word "ERROR" at the start of the line.

Why not just infer that from the SLAPILOG* type? It would save changing the api, and it would have the same effect? It seems like we are double handling here ....

389-ds-bot commented 4 years ago

Comment from rmeggins (@richm) at 2016-09-09 05:14:26

Replying to [comment:20 mreynolds389]:

Replying to [comment:19 richm]:

You can use __func__ to get the name of the current function. Looks like ipa is using that in more recent code. See http://www.delorie.com/gnu/docs/gcc/cpp_21.html

e.g. instead of

        LDAPDebug(LDAP_DEBUG_ANY, LOG_ERR, "check_oid - Non numeric oid %s passed to check_oid\n",oid,0,0); 
        slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "slapi_be_setentrypoint", 
    580                 "Unknown entry point %d\n", entrypoint); 

use

         LDAPDebug(LDAP_DEBUG_ANY, LOG_ERR, __func__ " - Non numeric oid %s passed to " __func__ "\n",oid,0,0); 
       slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, __func__, ...

however, it says that "... Neither of them (func or FUNCTION) is a macro; the preprocessor does not know the name of the current function" so you might not be able to use string pasting as in the check_oid example, but you can use it as an argument to printf %s.

That's actually very cool, and maintenance free... However, redoing this patch might consume another day of work. Should I redo it, or just use this technique on the next round of patches?

Let's find out how much it actually costs.

389-ds-bot commented 4 years ago

Comment from nhosoi (@nhosoi) at 2016-09-09 05:54:47

Thank you so much for adding the severity, Mark! I really like to see LOG_EMERG, LOG_ALERT, LOG_CRIT, etc. along with the messages.

I have some requests like this. These functions are in #ifdef FOR_DEBUGGING, better be LOG_DEBUG instead of LOG_ERR?

connection_table_dump_active_connection (Connection *c)
connection_table_dump_active_connections (Connection_Table *ct)

ditto...

248 248 #ifdef FOR_DEBUGGING 
249         slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "connection", "Moving connection out of active list\n"); 
    249     slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "connection_table_move_connection_out_of_active_list", "Moving connection out of active list\n"); 
250 250     connection_table_dump_active_connection (c); 
251 251 #endif 

ditto...

307 309 #ifdef FOR_DEBUGGING 
308         slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "connection", "Moving connection into active list\n"); 
    310     slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "connection_table_move_connection_on_to_active_list", 
    311         "Moving connection into active list\n"); 
309 312     connection_table_dump_active_connection (c); 
310 313 #endif 

These are even further fine tunings... Since you are still working on the big transition, should we wait for the task done? Thanks!

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2016-09-09 18:36:03

Replying to [comment:21 Firstyear]:

So what's the point of the LOG_ERR? We already have the message at FATAL, we are just "doubly" expresing fatal.

The point of this patch apparently is to make it when you say "LOG_ERR" that it puts the word "ERROR" at the start of the line.

It's actually "ERR" after we use the syslog mappings.

see /usr/include/sys/syslog.h

Why not just infer that from the SLAPILOG* type? It would save changing the api, and it would have the same effect? It seems like we are double handling here ....

The whole point of all of this is that SLAPI_LOG_FATAL is too general, and there is no easy way sort/parse the errors log. We log info messages, warnings, errors, severe errors, and debug messages. all using SLAPI_LOG_FATAL. There is no easy way to determine what is serious and what is not. Customers get confused by this ALL the time. So is there a bit of double handling, of course there is, but the benefit far out weights it.

So I think you are implying that we should add new logging levels? The problem with that, is that they need to be set in the config and are not logged by default. We are also taking advantage of the syslog error codes, and moving towards a more standard way of logging errors.

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2016-09-09 18:51:29

Replying to [comment:24 nhosoi]:

Thank you so much for adding the severity, Mark! I really like to see LOG_EMERG, LOG_ALERT, LOG_CRIT, etc. along with the messages.

I have some requests like this. These functions are in #ifdef FOR_DEBUGGING, better be LOG_DEBUG instead of LOG_ERR?

connection_table_dump_active_connection (Connection *c)
connection_table_dump_active_connections (Connection_Table *ct)

ditto...

248   248 #ifdef FOR_DEBUGGING 
249           slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "connection", "Moving connection out of active list\n"); 
  249     slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "connection_table_move_connection_out_of_active_list", "Moving connection out of active list\n"); 
250   250     connection_table_dump_active_connection (c); 
251   251 #endif 

ditto...

307   309 #ifdef FOR_DEBUGGING 
308           slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "connection", "Moving connection into active list\n"); 
  310     slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "connection_table_move_connection_on_to_active_list", 
  311         "Moving connection into active list\n"); 
309   312     connection_table_dump_active_connection (c); 
310   313 #endif 

These are even further fine tunings... Since you are still working on the big transition, should we wait for the task done? Thanks!

You're right they should be LOG_DEBUG I'll go back and up this patch, and make sure I follow this moving forward. Thanks.

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2016-09-09 19:31:56

Replying to [comment:19 richm]:

You can use __func__ to get the name of the current function. Looks like ipa is using that in more recent code. See http://www.delorie.com/gnu/docs/gcc/cpp_21.html

e.g. instead of

        LDAPDebug(LDAP_DEBUG_ANY, LOG_ERR, "check_oid - Non numeric oid %s passed to check_oid\n",oid,0,0); 
        slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "slapi_be_setentrypoint", 
  580                 "Unknown entry point %d\n", entrypoint); 

use

         LDAPDebug(LDAP_DEBUG_ANY, LOG_ERR, __func__ " - Non numeric oid %s passed to " __func__ "\n",oid,0,0); 
       slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, __func__, ...

however, it says that "... Neither of them (func or FUNCTION) is a macro; the preprocessor does not know the name of the current function" so you might not be able to use string pasting as in the check_oid example, but you can use it as an argument to printf %s.

Okay, so it only works using it as a printf arg, and it only works with LDAPDebug().

LDAPDebug(LDAP_DEBUG_ANY, LOG_INFO, "%s - Message...\n", __func__, 0, 0);

It does not work in slapi_log_error() for the subsystem. So if we implemented this we would have inconsistencies between the two logging functions. Plus this adds an arg to the limited list of args for LDAPDebug(). Some calls already use up all the available args in LDAPDebug(). Based on these issues I'm not sure this is worth implementing after all.

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2016-09-09 19:36:24

Replying to [comment:4 richm]:

Replying to [comment:3 mreynolds389]:

Refractoring tasks:

  • LDAPDebug() needs to be replaced with slapi_log_error()

This will remove the ability to compile without those logging statements. This may have performance implications.

Can you elaborate on this? Wouldn't this just impact the compile time (not the run time)?

It would just be nice to have a single logging function, and slapi_log_error() is much better than LDAPDebug().

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2016-09-09 20:14:48

New patch attached. Applied Noriko's requests...

389-ds-bot commented 4 years ago

Comment from rmeggins (@richm) at 2016-09-10 23:47:08

Replying to [comment:27 mreynolds389]:

Replying to [comment:19 richm]:

You can use __func__ to get the name of the current function. Looks like ipa is using that in more recent code. See http://www.delorie.com/gnu/docs/gcc/cpp_21.html

e.g. instead of

        LDAPDebug(LDAP_DEBUG_ANY, LOG_ERR, "check_oid - Non numeric oid %s passed to check_oid\n",oid,0,0); 
        slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "slapi_be_setentrypoint", 
    580                 "Unknown entry point %d\n", entrypoint); 

use

         LDAPDebug(LDAP_DEBUG_ANY, LOG_ERR, __func__ " - Non numeric oid %s passed to " __func__ "\n",oid,0,0); 
       slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, __func__, ...

however, it says that "... Neither of them (func or FUNCTION) is a macro; the preprocessor does not know the name of the current function" so you might not be able to use string pasting as in the check_oid example, but you can use it as an argument to printf %s.

Okay, so it only works using it as a printf arg, and it only works with LDAPDebug().

LDAPDebug(LDAP_DEBUG_ANY, LOG_INFO, "%s - Message...\n", __func__, 0, 0);

It does not work in slapi_log_error() for the subsystem.

That's really strange. freeipa is using it in a very similar fashion. Why doesn't it work?

So if we implemented this we would have inconsistencies between the two logging functions. Plus this adds an arg to the limited list of args for LDAPDebug(). Some calls already use up all the available args in LDAPDebug(). Based on these issues I'm not sure this is worth implementing after all.

We would have to add another macro (e.g. LDAPDebugFunc) with an argument just for this.

389-ds-bot commented 4 years ago

Comment from rmeggins (@richm) at 2016-09-10 23:52:03

Replying to [comment:28 mreynolds389]:

Replying to [comment:4 richm]:

Replying to [comment:3 mreynolds389]:

Refractoring tasks:

  • LDAPDebug() needs to be replaced with slapi_log_error()

This will remove the ability to compile without those logging statements. This may have performance implications.

Can you elaborate on this? Wouldn't this just impact the compile time (not the run time)?

If LDAP_DEBUG is not defined at compile time, the LDAPDebug* macros will evaluate to this:

/* debugging stuff */
/* Disable by default */
#define LDAPDebug( level, fmt, arg1, arg2, arg3 )
#define LDAPDebugLevelIsSet( level ) (0)
#define LDAPDebug0Args( level, fmt )
#define LDAPDebug1Arg( level, fmt, arg )
#define LDAPDebug2Args( level, fmt, arg1, arg2 )

That is, they will evaluate to nothing. Which means they won't be called at all at runtime, because there is no code. If we have some very performance sensitive code that we want to test, we can undefine LDAP_DEBUG to completely eliminate calls to LDAPDebug*.

It would just be nice to have a single logging function, and slapi_log_error() is much better than LDAPDebug().

I agree. It is confusing to have both LDAPDebug and slapi_log_error.

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2016-09-11 01:45:52

Replying to [comment:31 richm]:

  • LDAPDebug() needs to be replaced with slapi_log_error()

This will remove the ability to compile without those logging statements. This may have performance implications.

Can you elaborate on this? Wouldn't this just impact the compile time (not the run time)?

If LDAP_DEBUG is not defined at compile time, the LDAPDebug* macros will evaluate to this:

/* debugging stuff */
/* Disable by default */
#define LDAPDebug( level, fmt, arg1, arg2, arg3 )
#define LDAPDebugLevelIsSet( level ) (0)
#define LDAPDebug0Args( level, fmt )
#define LDAPDebug1Arg( level, fmt, arg )
#define LDAPDebug2Args( level, fmt, arg1, arg2 )

That is, they will evaluate to nothing. Which means they won't be called at all at runtime, because there is no code. If we have some very performance sensitive code that we want to test, we can undefine LDAP_DEBUG to completely eliminate calls to LDAPDebug*.

Ahhh okay. We can add a DEFINE to slapi_log_error() to accomplish the same thing (except it will impact all logging). If this is okay, I'd like to replace the LDAPDebug, and update slapi_log_error with a DISABLE_LOGGING define.

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2016-09-11 02:50:09

Replying to [comment:30 richm]:

It does not work in slapi_log_error() for the subsystem.

That's really strange. freeipa is using it in a very similar fashion. Why doesn't it work?

It does work, I got a compiler warning(char vs const char), but I originally thought it was an error.

Lets create a new ticket for this to be investigated. I am also dropping the new error log severity message logging level for now. That can be another ticket if it's deemed useful to have such a new logging level - as it is now very easy just to grep the errors log for the new severity levels.

389-ds-bot commented 4 years ago

Comment from rmeggins (@richm) at 2016-09-11 05:02:18

Replying to [comment:32 mreynolds389]:

Replying to [comment:31 richm]:

  • LDAPDebug() needs to be replaced with slapi_log_error()

This will remove the ability to compile without those logging statements. This may have performance implications.

Can you elaborate on this? Wouldn't this just impact the compile time (not the run time)?

If LDAP_DEBUG is not defined at compile time, the LDAPDebug* macros will evaluate to this:

/* debugging stuff */
/* Disable by default */
#define LDAPDebug( level, fmt, arg1, arg2, arg3 )
#define LDAPDebugLevelIsSet( level ) (0)
#define LDAPDebug0Args( level, fmt )
#define LDAPDebug1Arg( level, fmt, arg )
#define LDAPDebug2Args( level, fmt, arg1, arg2 )

That is, they will evaluate to nothing. Which means they won't be called at all at runtime, because there is no code. If we have some very performance sensitive code that we want to test, we can undefine LDAP_DEBUG to completely eliminate calls to LDAPDebug*.

Ahhh okay. We can add a DEFINE to slapi_log_error() to accomplish the same thing (except it will impact all logging). If this is okay, I'd like to replace the LDAPDebug, and update slapi_log_error with a DISABLE_LOGGING define.

If that works. In general, you can't have a #define replace a varadic function. You have to do something like this:

#if LDAP_DEBUG
#define slapi_log_error((x))
#else
#define slapi_log_error((x)) slapi_log_error_real((x))
#endif
...

    slapi_log_error((SLAPI_LOG_FATAL, "subsystem", "%s %s ..."));

The extra open and close () are significant.

It's possible there is now a C/gcc standard or extension that allows you to #define a varadic function.

389-ds-bot commented 4 years ago

Comment from rmeggins (@richm) at 2016-09-11 05:02:40

Replying to [comment:33 mreynolds389]:

Replying to [comment:30 richm]:

It does not work in slapi_log_error() for the subsystem.

That's really strange. freeipa is using it in a very similar fashion. Why doesn't it work?

It does work, I got a compiler warning(char vs const char), but I originally thought it was an error.

Lets create a new ticket for this to be investigated. I am also dropping the new error log severity message logging level for now. That can be another ticket if it's deemed useful to have such a new logging level - as it is now very easy just to grep the errors log for the new severity levels.

+1

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2016-09-11 18:01:43

Replying to [comment:34 richm]:

Replying to [comment:32 mreynolds389]:

Replying to [comment:31 richm]:

  • LDAPDebug() needs to be replaced with slapi_log_error()

This will remove the ability to compile without those logging statements. This may have performance implications.

Can you elaborate on this? Wouldn't this just impact the compile time (not the run time)?

If LDAP_DEBUG is not defined at compile time, the LDAPDebug* macros will evaluate to this:

/* debugging stuff */
/* Disable by default */
#define LDAPDebug( level, fmt, arg1, arg2, arg3 )
#define LDAPDebugLevelIsSet( level ) (0)
#define LDAPDebug0Args( level, fmt )
#define LDAPDebug1Arg( level, fmt, arg )
#define LDAPDebug2Args( level, fmt, arg1, arg2 )

That is, they will evaluate to nothing. Which means they won't be called at all at runtime, because there is no code. If we have some very performance sensitive code that we want to test, we can undefine LDAP_DEBUG to completely eliminate calls to LDAPDebug*.

Ahhh okay. We can add a DEFINE to slapi_log_error() to accomplish the same thing (except it will impact all logging). If this is okay, I'd like to replace the LDAPDebug, and update slapi_log_error with a DISABLE_LOGGING define.

If that works. In general, you can't have a #define replace a varadic function. You have to do something like this:

#if LDAP_DEBUG
#define slapi_log_error((x))
#else
#define slapi_log_error((x)) slapi_log_error_real((x))
#endif
...

    slapi_log_error((SLAPI_LOG_FATAL, "subsystem", "%s %s ..."));

The extra open and close () are significant.

It's possible there is now a C/gcc standard or extension that allows you to #define a varadic function.

Hmmm, I haven't tried this, but I thought we could do something like this:

slapi_log_error( int loglevel, int severity, char *subsystem, char *fmt, ... )
{
#if defined LOGGING_ENABLED
    /* all the logging code */
#endif
    return;
}

But this is still making a function call - not quite the same thing as the LDAPDebug macro. I'll do some research on the varadic function in C/gcc...

389-ds-bot commented 4 years ago

Comment from firstyear (@Firstyear) at 2016-09-12 04:56:54

Replying to [comment:25 mreynolds389]:

Replying to [comment:21 Firstyear]:

So what's the point of the LOG_ERR? We already have the message at FATAL, we are just "doubly" expresing fatal.

The point of this patch apparently is to make it when you say "LOG_ERR" that it puts the word "ERROR" at the start of the line.

It's actually "ERR" after we use the syslog mappings.

see /usr/include/sys/syslog.h

So why not have a table to map SLAPI_LOG_type -> syslog type? That would be better here.

Why not just infer that from the SLAPILOG* type? It would save changing the api, and it would have the same effect? It seems like we are double handling here ....

The whole point of all of this is that SLAPI_LOG_FATAL is too general, and there is no easy way sort/parse the errors log. We log info messages, warnings, errors, severe errors, and debug messages. all using SLAPI_LOG_FATAL. There is no easy way to determine what is serious and what is not. Customers get confused by this ALL the time. So is there a bit of double handling, of course there is, but the benefit far out weights it.

So I think you are implying that we should add new logging levels? The problem with that, is that they need to be set in the config and are not logged by default. We are also taking advantage of the syslog error codes, and moving towards a more standard way of logging errors.

Yes, that is my point. What is the problem we are trying to solve here: That we use SLAPI_LOG_FATAL too much because we don't have the log levels we need to express our messages, and that the default level is incorrect even if we had the other levels.

Instead, lets not make it more complex, we have to bite the bullet and just fix the default level, and the available log levels. I think we are actually moving to a more complex scenario. For example, at default log level, what happens here:

    SLAPI_LOG_PLUGIN, LOG_DEBUG
    SLAPI_LOG_PLUGIN, LOG_ERR
    SLAPI_LOG_FATAL, LOG_INFO
    SLAPI_LOG_FATAL, LOG_ERR

The fact that I have to really question what will happen to these messages, based on the combination of both flags is the immediate warning that this is the wrong approach.

So, I would suggest we add some new levels such as:

SLAPI_LOG_DEBUG
SLAPI_LOG_INFO
SLAPI_LOG_WARNING

And that we make the default error level set to show INFO, WARNING, and FATAL out of the box.

Some of the comments in this thread indicate some question about this, and perhaps backing out I'm not sure. Either way, I'm still not satisfied with this approach I'm sorry :(

389-ds-bot commented 4 years ago

Comment from rmeggins (@richm) at 2016-09-12 08:21:37

If you just use slapi_log_error as a function, any arguments you pass in will still be evaluated. e.g. if you do something like this:

slapi_log_error(SLAPI_LOG_FATAL, something, "%d\n", expensive_function_that_returns_an_int());

The expensive function will be evaluated regardless. The nice thing about a macro like LDAPDebug is that you can completely compile out this code.

I'll do some research on the varadic function in C/gcc...

https://en.wikipedia.org/wiki/Variadic_macro

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2016-09-12 08:34:15

Replying to [comment:37 Firstyear]:

Some of the comments in this thread indicate some question about this, and perhaps backing out I'm not sure. Either way, I'm still not satisfied with this approach I'm sorry :(

I don't think you've been following/aware of what's going on here. I will recap what is going on, but everyone else is on board with the changes that are being made (unless something changed over the weekend that I'm not aware of).

First, we are NOT doing any severity level log filtering. No new configurable log level is going to be used, or logging behavior changed. It isn't really needed (not yet). What this ticket is doing is just standardizing the logging format and improving error classification. The logging is going to look like:

<timestamp> - <log level> - <plugin/function(aka subsystem)> - <message>

That's all that is going on here. This will finally allow for some basic parsing of the errors log, and clarity of what the message really means. The log levels are defined in syslog, and I have used every level in my work so far. Here are some examples:

[09/Sep/2016:11:54:42.416337604 -0400] - INFO - ldbm_back_ldbm2index - userroot: Indexed 50000 entries (99%).
[09/Sep/2016:11:54:42.430673504 -0400] - INFO - ldbm_back_ldbm2index - userroot: Finished indexing.
[10/Sep/2016:23:46:07.624057716 -0400] - NOTICE - ldbm_back_start - userroot: entry cache size 10485760 B is less than db size 102645760 B; We recommend to increase the entry cache size nsslapd-cachememsize.

Plugins will be:

[10/Sep/2016:23:46:07.624057716 -0400] - ERR - NSMMReplicationPlugin - replica_config_modify: replica already exists
[10/Sep/2016:23:46:07.624057717 -0400] - INFO - NSMMReplicationPlugin - cleanruv_task: cleaning rid (99)...

So, the only thing I did in the code was add a sev_level to the logging functions, and then cleaned up/standardized all the logging messages. I'm also like 75% complete in this task.

Moving forward I will be looking into replacing LDAPDebug as well, but this phase needs to be completed first. But, if all of this still really bothers you then please bring it up with Nathan.

389-ds-bot commented 4 years ago

Comment from firstyear (@Firstyear) at 2016-09-14 04:56:00

Replying to [comment:38 richm]:

If you just use slapi_log_error as a function, any arguments you pass in will still be evaluated. e.g. if you do something like this:

slapi_log_error(SLAPI_LOG_FATAL, something, "%d\n", expensive_function_that_returns_an_int());

The expensive function will be evaluated regardless. The nice thing about a macro like LDAPDebug is that you can completely compile out this code.

I'll do some research on the varadic function in C/gcc...

https://en.wikipedia.org/wiki/Variadic_macro

I don't want to add more macros for this. Macros cause headaches with linking to other langs(which I am especially interested in): You basically make a function which can never be exported or used, but has huge useful purpose. I would like to avoid this please.

389-ds-bot commented 4 years ago

Comment from rmeggins (@richm) at 2016-09-14 04:59:46

Replying to [comment:40 Firstyear]:

Replying to [comment:38 richm]:

If you just use slapi_log_error as a function, any arguments you pass in will still be evaluated. e.g. if you do something like this:

slapi_log_error(SLAPI_LOG_FATAL, something, "%d\n", expensive_function_that_returns_an_int());

The expensive function will be evaluated regardless. The nice thing about a macro like LDAPDebug is that you can completely compile out this code.

I'll do some research on the varadic function in C/gcc...

https://en.wikipedia.org/wiki/Variadic_macro

I don't want to add more macros for this. Macros cause headaches with linking to other langs(which I am especially interested in): You basically make a function which can never be exported or used, but has huge useful purpose. I would like to avoid this please.

If you don't want to add more macros, what do you want to do with the nice-to-have, possibly critical-to-have-in-some-situations LDAPDebug* macros, that can't be used if we add/change the logging functions?

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2016-09-20 22:27:56

There is still more to do. Need to investigate the removal of LDAPDebug, etc.

389-ds-bot commented 4 years ago

Comment from nhosoi (@nhosoi) at 2016-09-21 01:26:26

Mark,

Your patch looks good. I compared with 11a5b1e412cb77a752366e352cb9d203c52b108e, which is just before the first patch for this ticket. It gave me pretty clean diffs that mainly contains refined log level and messages with the new definitions/logging mechanisms. (Also, I believe the message part has been already reviewed in the previous patches.)

I'd like to ask one request on ssl.c. In the file, for some reason, 5 messages start with "Warning:" which were automatically set to SLAPI_LOG_WARNING. I'd think they are all errors. Could you please replace SLAPI_LOG_WARNING with SLAPI_LOG_ERR?

These look rather SLAPI_LOG_CRIT to me... :)

+    slapi_log_error(SLAPI_LOG_WARNING, "Security Initialization",
+        "warn_if_no_cert_file: certificate DB file cert8.db nor cert7.db exists in [%s] - "
         "SSL initialization will likely fail\n", dir);

+    slapi_log_error(SLAPI_LOG_WARNING, "Security Initialization",
+         "warn_if_no_key_file: key DB file %s does not exist - SSL initialization will "
           "likely fail\n", filename);

+    slapi_log_error(SLAPI_LOG_WARNING, "Security Initialization",
+        "slapd_nss_init: The key/cert database directory [%s] is not writable by "
         "the server uid [%s]: initialization likely to fail.\n",
         certdir, serveruser);

Other than that, you have my ack.

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2016-09-21 02:26:20

final refractor 0001-Ticket-48978-Update-error-logging-with-new-codes.patch

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2016-09-21 02:28:16

Replying to [comment:43 nhosoi]:

I'd like to ask one request on ssl.c. In the file, for some reason, 5 messages start with "Warning:" which were automatically set to SLAPI_LOG_WARNING. I'd think they are all errors. Could you please replace SLAPI_LOG_WARNING with SLAPI_LOG_ERR?

Done, new patch attached!

389-ds-bot commented 4 years ago

Comment from nhosoi (@nhosoi) at 2016-09-21 02:42:43

Great! Thank you so much, Mark! The new log looks really cool:

[..] - INFO - slapd_extract_cert - CA CERT NAME: CA certificate
[..] - WARN - SSL alert: Sending pin request to SVRCore. You may need to run systemd-tty-ask-password-agent to provide the password.
[..] - INFO - slapd_extract_cert - SERVER CERT NAME: Server-Cert1
[..] - INFO - SSL info: Configured NSS Ciphers
[..] - INFO - SSL info:     TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256: enabled
[..] - INFO - SSL info:     TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256: enabled
    ........................
[..] - INFO - Security Initialization - slapd_ssl_init2 - Configured SSL version range: min: TLS1.1, max: TLS1.2
[..] - INFO - 389-Directory/1.3.6.0.20160920git73635f5 B2016.264.2035 starting up
[..] - INFO - slapd started.  Listening on All Interfaces port 389 for LDAP requests
[..] - INFO - Listening on All Interfaces port 10637 for LDAPS requests

Do you need some more acks to push, Mark? I'm leaving the Review status to review? for now...

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2016-09-21 02:44:59

Replying to [comment:45 nhosoi]:

Do you need some more acks to push, Mark? I'm leaving the Review status to review? for now...

I don't think so, yours would be great :-)

389-ds-bot commented 4 years ago

Comment from nhosoi (@nhosoi) at 2016-09-21 03:22:01

Ok, then... :)

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2016-09-21 03:34:16

2b45d58..fb9b9ef master -> master commit fb9b9ef0a74f64abd89e52c5c90fe7dfcea6ba71 Author: Mark Reynolds mreynolds389@redhat.com Date: Tue Sep 20 16:25:51 2016 -0400