kieker-monitoring / kieker

Kieker's main repository
Apache License 2.0
70 stars 41 forks source link

[KIEKER-839] Analysis framework: Java Errors ignored when delivering records #995

Open rju opened 1 week ago

rju commented 1 week ago

JIRA Issue: KIEKER-839 Analysis framework: Java Errors ignored when delivering records Original Reporter: Andre van Hoorn


Filter A delivers a record to filter B. A Java Error occurs (e.g., java.lang.OutOfMemoryError). This error is written to the log. However, the analysis still terminates with success status. See for example, KIEKER-821 Done

Note that the analysis should not terminate in each case a filter generates an exception (e.g., NPE). However, errors like OutOfMemoryError have an impact on the whole system.

Checklist:

rju commented 3 days ago

author nils-christian -- Sat, 20 Apr 2013 09:41:43 +0200

What would be the recommended action to perform in the case of a java.lang.Error? Can one still perform an emergency shutdown of the whole analysis in the case of an OutOfMemoryError? It is not really possible/likely to recover from that one, right?

rju commented 3 days ago

author Jan Waller -- Sat, 20 Apr 2013 09:51:52 +0200

A log and then throw, I guess.

rju commented 3 days ago

author nils-christian -- Sat, 20 Apr 2013 19:21:41 +0200

Replying to [jwa|comment:2]:
> A log and then throw, I guess.

This is technically currently the case
- The deliver method throws the error already. It depends on the calling method what happens next (which can also be
- let us assume a reader
- in another thread than the reader thread and is therefore not locatable by the AC).

Maybe: Log it and send the AC a signal to shut the analysis down? Or log it and terminate the VM (which would be some kind of ugly steamroller tactics though).

rju commented 3 days ago

author Jan Waller -- Sun, 21 Apr 2013 17:01:41 +0200

Ok, what should happen on (unrecoverable) exception, i.e., errors:

rju commented 3 days ago

author nils-christian -- Sun, 21 Apr 2013 17:06:52 +0200

Sounds good to me.

Remaining problem: IProjectContext (as it is available for the AbstractPlugin) doesn't have a terminate method. A cast to AnalysisController would be necessary
- but could fail if anyone decides to use something else than the AC as project context (just theoretically...)

rju commented 3 days ago

author nils-christian -- Sun, 21 Apr 2013 17:08:29 +0200

Cancel that. The components check this in the constructor.

rju commented 3 days ago

author Jan Waller -- Tue, 7 May 2013 13:11:20 +0200

This behavior should already be the case!
Can we reproduce the problem somehow on a small example?

rju commented 3 days ago

author nils-christian -- Tue, 7 May 2013 13:43:39 +0200

Replying to [jwa|comment:7]:
> This behavior should already be the case!
> Can we reproduce the problem somehow on a small example?

final IAnalysisController ac = new AnalysisController();

final TimeReader tr = new TimeReader(new Configuration(), ac);
final AbstractFilterPlugin fp = new AbstractFilterPlugin(new Configuration(), ac) {

    InputPort(name = "in", eventTypes = Object.class)
    public void in(final Object data) {
        System.out.println(data);
        throw new OutOfMemoryError();
    }

    Override
    public Configuration getCurrentConfiguration() {
        return new Configuration();
    }
};

ac.connect(tr, TimeReader.OUTPUT_PORT_NAME_TIMESTAMPS, fp, "in");

ac.run();

A first element is sent to the filter. Nothing is logged and the analysis doesn't seem to be terminated.

rju commented 3 days ago

author Jan Waller -- Tue, 7 May 2013 14:18:57 +0200

Ok, can you reproduce this problem with another reader?

TimeReader uses an SchedulesThreadPoolExecutor, which gobbles all errors up. So the bug would be with the reader.

rju commented 3 days ago

author nils-christian -- Tue, 7 May 2013 14:23:07 +0200

Replying to [jwa|comment:9]:
> Ok, can you reproduce this problem with another reader?
>
> TimeReader uses an SchedulesThreadPoolExecutor, which gobbles all errors up. So the bug would be with the reader.

Currently not, no. That explains the problem. But I guess this isn't a desired behavior of the TimeReader either, right?

rju commented 3 days ago

author Jan Waller -- Tue, 7 May 2013 14:57:06 +0200

I fixed the TimeReader in fe0a99127a8a87a227ce23da33e62c33162d2dbb

Please take a look and test.
Possible problem: If the terminate of the reader is executed by the thread that is started by the pool (so within the triggered analysis), it does not shutdown correctly.

rju commented 3 days ago

author Jan Waller -- Tue, 7 May 2013 15:06:59 +0200

ok, the last point should be fixed with d3dde983acf7f535bc2aa6aae91dc4327544d539

Is the rest of the bug still valid? Or was it only a problem with the TimeReader?

rju commented 3 days ago

author André van Hoorn -- Tue, 7 May 2013 15:09:08 +0200

We definitely need a JUnit test for this.

rju commented 3 days ago

author Jan Waller -- Tue, 7 May 2013 15:42:04 +0200

yet another fix in 81beec33bf8a9ea75ad166529fe539496e8c3885

rju commented 3 days ago

author Jan Waller -- Mon, 27 May 2013 11:29:53 +0200

Replying to [avh|comment:13]:
> We definitely need a JUnit test for this.

What would you want to test with jUnit in this case?

In my opinion this can be closed fixed.

rju commented 3 days ago

author André van Hoorn -- Mon, 3 Jun 2013 17:25:13 +0200

Need minimal example (first try to reproduce KIEKER-821 Done ).

rju commented 3 days ago

author nils-christian -- Wed, 5 Jun 2013 20:54:25 +0200

Tried to reproduce KIEKER-821 Done . Result was:
- No log entry
- Normal analysis outputs and then an exception

Exception was

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Unknown Source)
        at java.lang.AbstractStringBuilder.expandCapacity(Unknown Source)
        at java.lang.AbstractStringBuilder.ensureCapacityInternal(Unknown Source)
        at java.lang.AbstractStringBuilder.append(Unknown Source)
        at java.lang.StringBuilder.append(Unknown Source)
        at java.lang.StringBuilder.append(Unknown Source)
        at kieker.analysis.plugin.filter.flow.TraceEventRecords.toString(TraceEventRecords.java:74)
        at java.lang.String.valueOf(Unknown Source)
        at java.lang.StringBuilder.append(Unknown Source)
        at kieker.tools.traceAnalysis.filter.flow.EventRecordTraceCounter.inputInvalidTrace(EventRecordTraceCounter.java:100)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at kieker.analysis.plugin.AbstractPlugin.deliver(AbstractPlugin.java:195)
        at kieker.analysis.plugin.filter.flow.EventRecordTraceReconstructionFilter.terminate(EventRecordTraceReconstructionFilter.java:222)
        at kieker.analysis.plugin.AbstractPlugin.shutdown(AbstractPlugin.java:549)
        at kieker.analysis.plugin.AbstractPlugin.shutdown(AbstractPlugin.java:546)
        at kieker.analysis.plugin.AbstractPlugin.shutdown(AbstractPlugin.java:546)
        at kieker.analysis.plugin.AbstractPlugin.shutdown(AbstractPlugin.java:556)
        at kieker.analysis.plugin.AbstractPlugin.shutdown(AbstractPlugin.java:556)
        at kieker.analysis.plugin.AbstractPlugin.shutdown(AbstractPlugin.java:556)
        at kieker.analysis.plugin.AbstractPlugin.shutdown(AbstractPlugin.java:556)
        at kieker.analysis.plugin.AbstractPlugin.shutdown(AbstractPlugin.java:556)
        at kieker.analysis.plugin.AbstractPlugin.shutdown(AbstractPlugin.java:556)
        at kieker.analysis.AnalysisController.terminate(AnalysisController.java:511)
        at kieker.analysis.AnalysisController.terminate(AnalysisController.java:489)
        at kieker.analysis.AnalysisController.run(AnalysisController.java:471)
        at kieker.tools.traceAnalysis.TraceAnalysisTool.dispatchTasks(TraceAnalysisTool.java:880)
        at kieker.tools.traceAnalysis.TraceAnalysisTool.main(TraceAnalysisTool.java:1033)

Guess this should not happen...

rju commented 3 days ago

author Jan Waller -- Wed, 5 Jun 2013 21:09:53 +0200

I guess this actually is what should happen

at kieker.analysis.AnalysisController.run(AnalysisController.java:471)
means we finished the analysis and shutdown (probably due to an earlier OutOfMemory error)

I guess the reader thread tries to log but failed due to yet another out of memory?
Then it finishes, thus freeing some memory until the next oom happens during the (orderly) shutdown of the analysis.

There is finally an actual bug or omission:
line 471 needs a try catch to log errors on shutdown.

rju commented 3 days ago

author nils-christian -- Wed, 5 Jun 2013 21:15:33 +0200

Replying to [jwa|comment:19]:
> I guess this actually is what should happen

I agree. I am not complaining about the call of the termination method. Just about the possibility that an error within the termination can result in a crash.

> There is finally an actual bug or omission:
> line 471 needs a try catch to log errors on shutdown.

Guess we should do this in the termination method. Technically the termination method can be called when the analysis setup failed as well.

Edit: Where a plugin could throw an error as well.

rju commented 3 days ago

author nils-christian -- Wed, 5 Jun 2013 21:25:52 +0200

I added a try-catch-block with 3df074c31d14c1320651cd2decfaebc36c9b6a6d. When the shutdown fails, the error (or the exception) is logged, the state is set to FAILED.

Seems to work. The trace analysis tool doesn't crash, the error is logged. Current log output:

Jun 05, 2013 9:19:23 PM kieker.analysis.plugin.reader.filesystem.FSDirectoryReader run
INFO: < Loading E:\kieker working copy\bin\in\kieker-20130307-170431472-UTC-000-Thread-1.dat
Jun 05, 2013 9:19:23 PM kieker.analysis.plugin.reader.filesystem.FSDirectoryReader run
INFO: < Loading E:\kieker working copy\bin\in\kieker-20130307-170437541-UTC-000-Thread-1.dat
Jun 05, 2013 9:19:24 PM kieker.analysis.plugin.reader.filesystem.FSDirectoryReader run
INFO: < Loading E:\kieker working copy\bin\in\kieker-20130307-170443980-UTC-000-Thread-1.dat
Jun 05, 2013 9:19:24 PM kieker.analysis.AnalysisController terminate
INFO: Terminating analysis.
Jun 05, 2013 9:19:24 PM kieker.analysis.plugin.reader.filesystem.FSReader terminate
INFO: Shutting down reader.
Jun 05, 2013 9:19:26 PM kieker.analysis.AnalysisController terminate
SEVERE: Error during shutdown.
java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Unknown Source)
    at java.lang.AbstractStringBuilder.expandCapacity(Unknown Source)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(Unknown Source)
    at java.lang.AbstractStringBuilder.append(Unknown Source)
    at java.lang.StringBuilder.append(Unknown Source)
    at java.lang.StringBuilder.append(Unknown Source)
    at kieker.analysis.plugin.filter.flow.TraceEventRecords.toString(TraceEventRecords.java:74)
    at java.lang.String.valueOf(Unknown Source)
    at java.lang.StringBuilder.append(Unknown Source)
    at kieker.tools.traceAnalysis.filter.flow.EventRecordTraceCounter.inputInvalidTrace(EventRecordTraceCounter.java:100)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at kieker.analysis.plugin.AbstractPlugin.deliver(AbstractPlugin.java:195)
    at kieker.analysis.plugin.filter.flow.EventRecordTraceReconstructionFilter.terminate(EventRecordTraceReconstructionFilter.java:222)
    at kieker.analysis.plugin.AbstractPlugin.shutdown(AbstractPlugin.java:549)
    at kieker.analysis.plugin.AbstractPlugin.shutdown(AbstractPlugin.java:546)
    at kieker.analysis.plugin.AbstractPlugin.shutdown(AbstractPlugin.java:546)
    at kieker.analysis.plugin.AbstractPlugin.shutdown(AbstractPlugin.java:556)
    at kieker.analysis.plugin.AbstractPlugin.shutdown(AbstractPlugin.java:556)
    at kieker.analysis.plugin.AbstractPlugin.shutdown(AbstractPlugin.java:556)
    at kieker.analysis.plugin.AbstractPlugin.shutdown(AbstractPlugin.java:556)
    at kieker.analysis.plugin.AbstractPlugin.shutdown(AbstractPlugin.java:556)
    at kieker.analysis.plugin.AbstractPlugin.shutdown(AbstractPlugin.java:556)
    at kieker.analysis.AnalysisController.terminate(AnalysisController.java:512)
    at kieker.analysis.AnalysisController.terminate(AnalysisController.java:489)
    at kieker.analysis.AnalysisController.run(AnalysisController.java:471)
    at kieker.tools.traceAnalysis.TraceAnalysisTool.dispatchTasks(TraceAnalysisTool.java:880)
    at kieker.tools.traceAnalysis.TraceAnalysisTool.main(TraceAnalysisTool.java:1033)

Jun 05, 2013 9:19:27 PM kieker.tools.traceAnalysis.TraceAnalysisTool dispatchTasks
INFO: Saved analysis configuration to file 'E:\kieker working copy\bin\out\traceAnalysis.kax'
rju commented 3 days ago

author Jan Waller -- Wed, 5 Jun 2013 21:51:26 +0200

You should first set the error state than log than notify the observer, otherwise log might produce a new error.

rju commented 3 days ago

author Jan Waller -- Thu, 6 Jun 2013 12:41:46 +0200

The TraceAnalysis Tool does currently not check for the state of the analysis after finishing it. This should probably be changed.

If state==failed handle it properly in the Tool.

rju commented 3 days ago

author André van Hoorn -- Mon, 10 Jun 2013 10:53:11 +0200

Replying to [jwa|comment:23]:
> The TraceAnalysis Tool does currently not check for the state of the analysis after finishing it. This should probably be changed.
>
> If state==failed handle it properly in the Tool.

Thanks, good point. Improved error handling and reporting in changeset:c2b276ecca21ea092dfde094e6c2dc76fe522eff/kieker-git

Error handling with log from KIEKER-821 Done

Expected (and actual) behavior:
1. Error reported on stderr
2. Refer to kieker.log for details
3. Return code != 0 (see prompt after execution)

avhAndrè van Hoorn-ThinkPad-RSS:0:~/git_work/kieker <master> $ bin/trace-analysis.sh -i /tmp/807-log/ -o tmp/ -p 1- --plot-Call-Trees --plot-Aggregated-Assembly-Call-Tree --print-Message-Traces --print-Execution-Traces --print-invalid-Execution-Traces --print-Deployment-Equivalence-Classes --print-Assembly-Equivalence-Classes

...

An error occured

See 'kieker.log' for details

real    1m35.644s
user    1m56.119s
sys     0m14.721s
avhAndrè van Hoorn-ThinkPad-RSS:1:~/git_work/kieker <master> $

avhAndrè van Hoorn-ThinkPad-RSS:0:~/git_work/kieker <master> $ tail -n 50 kieker.log 
INFO: < Loading /tmp/807-log/kieker-20130307-170552802-UTC-000-Thread-1.dat
Jun 10, 2013 10:46:19 AM kieker.analysis.plugin.reader.filesystem.FSDirectoryReader run
INFO: < Loading /tmp/807-log/kieker-20130307-170558634-UTC-000-Thread-1.dat
Jun 10, 2013 10:46:20 AM kieker.analysis.plugin.reader.filesystem.FSDirectoryReader run
INFO: < Loading /tmp/807-log/kieker-20130307-170605196-UTC-000-Thread-1.dat
Jun 10, 2013 10:46:28 AM kieker.analysis.AnalysisController run
SEVERE: Exception while reading on Reader 'FSReader-2' (FSReader).
java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOfRange(Arrays.java:3209)
        at java.lang.String.<init>(String.java:215)
        at java.lang.StringBuilder.toString(StringBuilder.java:430)
        at kieker.tools.traceAnalysis.systemModel.ExecutionTrace.toString(ExecutionTrace.java:240)
        at kieker.tools.traceAnalysis.filter.traceWriter.ExecutionTraceWriterFilter.newExecutionTrace(ExecutionTraceWriterFilter.java:108)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at kieker.analysis.plugin.AbstractPlugin.deliver(AbstractPlugin.java:195)
        at kieker.tools.traceAnalysis.filter.flow.TraceEventRecords2ExecutionAndMessageTraceFilter.inputTraceEvents(TraceEventRecords2ExecutionAndMessageTraceFilter.java:175)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at kieker.analysis.plugin.AbstractPlugin.deliver(AbstractPlugin.java:195)
        at kieker.analysis.plugin.filter.flow.EventRecordTraceReconstructionFilter.newEvent(EventRecordTraceReconstructionFilter.java:200)
        at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at kieker.analysis.plugin.AbstractPlugin.deliver(AbstractPlugin.java:195)
        at kieker.analysis.plugin.filter.select.TraceIdFilter.inputTraceEvent(TraceIdFilter.java:150)
        at kieker.analysis.plugin.filter.select.TraceIdFilter.inputCombined(TraceIdFilter.java:125)
        at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at kieker.analysis.plugin.AbstractPlugin.deliver(AbstractPlugin.java:195)
        at kieker.analysis.plugin.filter.select.TimestampFilter.inputTraceEvent(TimestampFilter.java:193)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at kieker.analysis.plugin.AbstractPlugin.deliver(AbstractPlugin.java:195)
Jun 10, 2013 10:46:28 AM kieker.analysis.AnalysisController terminate
INFO: Error during analysis. Terminating ...
Jun 10, 2013 10:46:28 AM kieker.analysis.plugin.reader.filesystem.FSReader terminate
INFO: Shutting down reader.
Jun 10, 2013 10:46:28 AM kieker.tools.traceAnalysis.filter.traceReconstruction.TraceReconstructionFilter terminate
INFO: terminate called with error an flag set or a trace processing occurred; won't process timeoutqueue any more.
Jun 10, 2013 10:46:28 AM kieker.tools.traceAnalysis.TraceAnalysisTool dispatchTasks
SEVERE: Analysis instance terminated in state other thanTERMINATED:FAILED
Jun 10, 2013 10:46:28 AM kieker.tools.traceAnalysis.TraceAnalysisTool dispatchTasks
INFO: Saved analysis configuration to file '/home/avh/git_work/kieker/tmp/1-traceAnalysis.kax'

Regression for valid log

Expected (and actual) behavior:
1. No error reported on stderr
2. Refer to kieker.log for details
3. Return code == 0 (see prompt after execution)

avhAndrè van Hoorn-ThinkPad-RSS:0:~/git_work/kieker <master> $ bin/trace-analysis.sh -i examples/userguide/ch5--trace-monitoring-aspectj/testdata/kieker-20100830-082225522-UTC/ -o tmp/ -p 1- --plot-Call-Trees --plot-Aggregated-Assembly-Call-Tree --print-Message-Traces --print-Execution-Traces --print-invalid-Execution-Traces --print-Deployment-Equivalence-Classes --print-Assembly-Equivalence-Classes

...

Analysis completed successfully

See 'kieker.log' for details

real    0m2.975s
user    0m6.772s
sys     0m0.352s
avhAndrè van Hoorn-ThinkPad-RSS:0:~/git_work/kieker <master> $ 

avhAndrè van Hoorn-ThinkPad-RSS:0:~/git_work/kieker <master> $ tail -n 50 kieker.log 
Jun 10, 2013 10:48:48 AM kieker.tools.traceAnalysis.TraceAnalysisTool main
INFO: Argument list: [-i, examples/userguide/ch5--trace-monitoring-aspectj/testdata/kieker-20100830-082225522-UTC/, -o, tmp/, -p, 1-, --plot-Call-Trees, --plot-Aggregated-Assembly-Call-Tree, --print-Message-Traces, --print-Execution-Traces, --print-invalid-Execution-Traces, --print-Deployment-Equivalence-Classes, --print-Assembly-Equivalence-Classes]
Jun 10, 2013 10:48:48 AM kieker.analysis.plugin.reader.filesystem.FSDirectoryReader run
INFO: < Loading /home/avh/git_work/kieker/examples/userguide/ch5--trace-monitoring-aspectj/testdata/kieker-20100830-082225522-UTC/kieker-20100830-082225582-UTC-Thread-2.dat
Jun 10, 2013 10:48:49 AM kieker.analysis.AnalysisController terminate
INFO: Terminating analysis.
Jun 10, 2013 10:48:49 AM kieker.analysis.plugin.reader.filesystem.FSReader terminate
INFO: Shutting down reader.
Jun 10, 2013 10:48:51 AM kieker.tools.traceAnalysis.TraceAnalysisTool dispatchTasks
INFO: Saved analysis configuration to file '/home/avh/git_work/kieker/tmp/1-traceAnalysis.kax'
rju commented 3 days ago

author Jan Waller -- Mon, 10 Jun 2013 14:00:42 +0200

so, close fixed?