apache / accumulo

Apache Accumulo
https://accumulo.apache.org
Apache License 2.0
1.07k stars 445 forks source link

remove problems reports #5019

Open keith-turner opened 3 weeks ago

keith-turner commented 3 weeks ago

Removes problem reports and replaces them with log messages that identifies the file and table that had the error.

keith-turner commented 2 weeks ago

No issue with these changes. But I'm wondering if all of the log messages that occur next to the removed ProblemReport should all go to a centralized Logger, including the ProblemReportingIterator.

I will take a pass at improving the log messages logger, may be able to use the table files logger in some cases.

ddanielr commented 2 weeks ago

No issue with these changes. But I'm wondering if all of the log messages that occur next to the removed ProblemReport should all go to a centralized Logger, including the ProblemReportingIterator.

I will take a pass at improving the log messages logger, may be able to use the table files logger in some cases.

We've discussed having an "event" type logger. Not sure if that makes things simpler vs having a dedicated logger for problems. Using MDC (Mapped Diagnostic Context) might make this easier. We currently use slf4j which has a noop MDC https://www.slf4j.org/api/org/slf4j/MDC.html but I believe both log4j and logback support MDC.

ctubbsii commented 2 weeks ago

Using MDC (Mapped Diagnostic Context) might make this easier

I don't think it would make it easier. slf4j, log4j 1.2, and log4j2 all have different thread-local context concepts, and it requires a converter to switch between them. It really introduces a lot of complexity, from using more of the slf4j API rather than treating it as a simple logging facade, to adding extra configuration and possibly code to convert between them, and modifying the logging templates to include the map context. It also forces developers to think about thread-local views of the context, every time you want to include thread-local context in a log message, and is likely going to lead to a lot of programmer errors with regard to logging (like not realizing that a log message originating from a class executed in an executor won't inherit the context, when refactoring code that logs).

It'd be much easier to just avoid MDC and ThreadLocalContext for logging, and limit our use of slf4j to a very simple logging facade for the basic log level methods logger.info(), logger.error(), etc. If we want to log additional contextual details, it's much easier to maintain if we just pass around the contextual data we want to log, and include it directly in the message string. MDC/ThreadLocalContext is really best used for general-purpose context information you want to include in every log message, but aren't directly related to what is being logged (like the client IP address and username associated with a request, in a server-side error message that handles the request - not directly related to the error, but perhaps useful to know who is triggering it or who is impacted by it). I don't think that's quite the use case we have here.

keith-turner commented 2 weeks ago

No issue with these changes. But I'm wondering if all of the log messages that occur next to the removed ProblemReport should all go to a centralized Logger, including the ProblemReportingIterator.

I will take a pass at improving the log messages logger, may be able to use the table files logger in some cases.

@dlmarion updated the logging to use some of the existing named loggers for tablet events in 111a7e5

keith-turner commented 2 weeks ago

I would like to apply this change to the 3.1 branch, and I think we should add a notice about it being removed in the 2.1.4 release.

@ctubbsii if you want to apply this change to the 3.1 branch that is fine with me. I am finished making changes with this branch so feel free to do whatever you think is best to get these changes to 3.1.

ctubbsii commented 2 weeks ago

I would like to apply this change to the 3.1 branch, and I think we should add a notice about it being removed in the 2.1.4 release.

@ctubbsii if you want to apply this change to the 3.1 branch that is fine with me. I am finished making changes with this branch so feel free to do whatever you think is best to get these changes to 3.1.

Okay. Please hold off on merging it, then. I'll rebase it onto the 3.1 branch, and move the upgrade code to the 3.1 upgrade code. The notice on the 2.1 monitor can be added in a subsequent change. But, I don't want this merged, because the target upgrade code will be different, and it will cause unnecessary conflicts if merged to main first.