odpi / egeria

Egeria core
https://egeria-project.org
Apache License 2.0

Exceptions being captured without auditing or error logging #1111

Closed by cmgrote 4 years ago

cmgrote commented 5 years ago

There still seem to be areas where exceptions are caught and then either not logged at all (neither to the audit log nor to the error log) or logged only as debug messages, without the full stack trace.

Example 1: when starting up a connector that has additional dependencies of its own, and those dependencies are not also included in the LOADER_PATH, you end up with a NoClassDefFoundError. OCFErrorCode appears to capture it and output it only as a debug message:

2019-06-07 09:58:12.078 DEBUG 13043 --- [nio-8080-exec-9] o.o.o.f.connectors.ffdc.OCFErrorCode     : OCFErrorCode.getMessage([org.odpi.egeria.connectors.ibm.igc.repositoryconnector.IGCOMRSRepositoryConnector, MetadataRepositoryNative.Connection.myserver]): Invalid Connector class org.odpi.egeria.connectors.ibm.igc.repositoryconnector.IGCOMRSRepositoryConnector
2019-06-07 09:58:12.079 DEBUG 13043 --- [nio-8080-exec-9] o.o.o.f.c.f.ConnectionCheckedException   : 500, org.odpi.egeria.connectors.ibm.igc.repositoryconnector.IGCOMRSRepositoryConnectorProvider, getConnector, java.lang.NoClassDefFoundError: org/odpi/egeria/connectors/ibm/igc/clientlibrary/IGCRestClient

This makes troubleshooting problems particularly challenging 😉
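To illustrate the difference, here is a minimal sketch that contrasts the two logging styles. It is not Egeria code: it uses java.util.logging only to stay dependency-free, and the class name LoggingContrast, the in-memory handler and the logger name are all made up for the example. It assumes the logger runs at INFO level, as a production server typically would.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class LoggingContrast {
    // Hypothetical in-memory handler so the example can inspect what was logged.
    static final class CapturingHandler extends Handler {
        final List<LogRecord> records = new ArrayList<>();
        @Override public void publish(LogRecord record) { records.add(record); }
        @Override public void flush() { }
        @Override public void close() { }
    }

    // Pattern described in the issue: debug level, message text only, no Throwable attached.
    static void logAsDebug(Logger log, Throwable error) {
        log.log(Level.FINE, "Invalid Connector class: " + error);
    }

    // Easier to diagnose: error level, with the Throwable attached so the stack trace is kept.
    static void logAsError(Logger log, Throwable error) {
        log.log(Level.SEVERE, "Invalid Connector class", error);
    }

    // Logs the same error both ways and returns the records that actually got through.
    static List<LogRecord> run() {
        Logger log = Logger.getLogger("logging.contrast.example");
        log.setUseParentHandlers(false);
        log.setLevel(Level.INFO); // assumed production level: FINE (debug) is filtered out
        CapturingHandler handler = new CapturingHandler();
        log.addHandler(handler);

        Throwable error = new NoClassDefFoundError("org/example/MissingDependency");
        logAsDebug(log, error);
        logAsError(log, error);

        log.removeHandler(handler);
        return handler.records;
    }

    public static void main(String[] args) {
        for (LogRecord record : run()) {
            System.out.println(record.getLevel() + " " + record.getMessage()
                    + " thrown=" + record.getThrown());
        }
    }
}
```

With the logger at INFO, only the error-level record survives, and it is the only one that carries the Throwable.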

mandy-chessell commented 5 years ago

OCFErrorCode is an enum that defines message structures. It does not capture exceptions - it puts out a debug log message whenever it is called to format an error message.

These types of errors are caught by the ConnectorBroker and wrapped in exceptions. Each exception includes a detailed message generated by OCFErrorCode.
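A rough sketch of that division of labour, for anyone following along. This is not the actual OCF code: SketchErrorCode, SketchConnectionException, getConnector and tryConnect are hypothetical stand-ins, and the message template is only modelled on the log output quoted above.

```java
import java.text.MessageFormat;
import java.util.logging.Level;
import java.util.logging.Logger;

public class ConnectorBrokerSketch {
    private static final Logger LOG = Logger.getLogger("connector.broker.sketch");

    // Hypothetical error-code enum: it defines message structures and formats them.
    // It does not catch anything itself.
    enum SketchErrorCode {
        INVALID_CONNECTOR(500, "Invalid Connector class {0} for connection {1}");

        final int httpCode;
        final String template;

        SketchErrorCode(int httpCode, String template) {
            this.httpCode = httpCode;
            this.template = template;
        }

        String format(String... params) {
            String message = MessageFormat.format(template, (Object[]) params);
            LOG.log(Level.FINE, message); // debug trace emitted whenever a message is formatted
            return message;
        }
    }

    // Hypothetical checked exception carrying the formatted message and the original cause.
    static class SketchConnectionException extends Exception {
        final int httpCode;

        SketchConnectionException(int httpCode, String message, Throwable cause) {
            super(message, cause);
            this.httpCode = httpCode;
        }
    }

    // The broker role: try to create the connector and wrap any failure,
    // including linkage errors such as NoClassDefFoundError.
    static Object getConnector(String className, String connectionName)
            throws SketchConnectionException {
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | LinkageError error) {
            SketchErrorCode code = SketchErrorCode.INVALID_CONNECTOR;
            throw new SketchConnectionException(
                    code.httpCode, code.format(className, connectionName), error);
        }
    }

    // Convenience wrapper that summarises the outcome as a string.
    static String tryConnect(String className, String connectionName) {
        try {
            getConnector(className, connectionName);
            return "OK";
        } catch (SketchConnectionException e) {
            return e.httpCode + ", " + e.getMessage()
                    + ", cause: " + e.getCause().getClass().getName();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryConnect("java.lang.StringBuilder", "myserver"));
        System.out.println(tryConnect("no.such.Connector", "myserver"));
    }
}
```

The point of the sketch is that the original cause travels inside the exception. Whether anyone sees it depends on what the code that catches that exception does with it.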

I am only guessing at the use case here as there is not much detail. However, I looked at the OMRS initialization logic and discovered the following ...

When the exception happens in one of the connectors used by the OMRS (such as the OMRS Repository Connector), the exception is caught and wrapped in an OMRSConfigErrorException, which is returned to the caller. Unfortunately, for security reasons, the nested exception stack is not returned from the server, so the detailed message is lost.

I have added an audit log message to each of the catch blocks in the methods that initialize the OMRS connectors. This should give some detail as to the original cause. However, this may not cover the use case referred to above.
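The shape of that change, as a minimal sketch under stated assumptions: the in-memory AUDIT_LOG list, SketchConfigErrorException, ConnectorFactory and callerView are hypothetical and stand in for the real audit log and OMRS classes. The caller's view is modelled as the top-level message only, since the nested exception stack is not returned from the server.

```java
import java.util.ArrayList;
import java.util.List;

public class AuditedInitSketch {
    // Hypothetical stand-in for the server-side audit log.
    static final List<String> AUDIT_LOG = new ArrayList<>();

    // Hypothetical configuration exception returned to the caller.
    static class SketchConfigErrorException extends Exception {
        SketchConfigErrorException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Hypothetical factory so the example can simulate a failing connector.
    interface ConnectorFactory {
        Object create() throws Exception;
    }

    static Object initializeConnector(String name, ConnectorFactory factory)
            throws SketchConfigErrorException {
        try {
            return factory.create();
        } catch (Exception | LinkageError error) {
            // Record the original cause on the server before wrapping it,
            // so the detail survives even when the caller never sees the nested exception.
            AUDIT_LOG.add("Unable to initialize " + name + ": " + error);
            throw new SketchConfigErrorException("Unable to initialize " + name, error);
        }
    }

    // Models what a remote caller gets back: the top-level message, no nested stack.
    static String callerView(String name, ConnectorFactory factory) {
        try {
            initializeConnector(name, factory);
            return "OK";
        } catch (SketchConfigErrorException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        String seenByCaller = callerView("repository connector", () -> {
            throw new NoClassDefFoundError("org/example/MissingDependency");
        });
        System.out.println("caller sees: " + seenByCaller);
        System.out.println("audit log:   " + AUDIT_LOG);
    }
}
```

In this sketch the caller still only sees the generic message, but the audit log entry names the NoClassDefFoundError and the missing class.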

mandy-chessell commented 5 years ago

I think this is fixed for the OMRS, but I will review other components to determine whether this is a more general issue.