microsoft / gctoolkit

Tool for parsing GC logs
https://github.com/microsoft/gctoolkit
MIT License
1.25k stars 153 forks source link

Gctoolkit can not analyze the timeOfLastEvent properly #201

Closed ZhangShushu123 closed 2 years ago

ZhangShushu123 commented 2 years ago

In preunified CMS log, the last event will be a ParNew Event. However, the mailbox in GCToolkitVert is CMS_TENURED_POOL_PARSER_OUTBOX which is defined in initMailBox.

ZhangShushu123 commented 2 years ago

The timeOfLastEvent in below Log is 2022-01-07T02:04:00.482+0800 + duration rather than 2022-01-07T02:03:53.462+0800 + duration.

2022-01-07T02:03:53.462+0800: 3118.557: [CMS-concurrent-reset: 0.011/0.011 secs] [Times: user=0.02 sys=0.01, real=0.01 secs]
2022-01-07T02:04:00.482+0800: 3125.577: [GC (Allocation Failure) 2022-01-07T02:04:00.482+0800: 3125.577: [ParNew: 909774K->57145K(943744K), 0.0535306 secs] 1498304K->670630K(4089472K), 0.0539072 secs] [Times: user=0.28 sys=0.08, real=0.05 secs]

The related code

private String initMailBox(Diary diary) {
        if ( diary.isG1GC()) return GCToolkitVertx.G1GC_PARSER_OUTBOX;
        if ( diary.isGenerational())
            return (diary.isCMS()) ? GCToolkitVertx.CMS_TENURED_POOL_PARSER_OUTBOX :GCToolkitVertx.GENERATIONAL_HEAP_PARSER_OUTBOX;
        if ( diary.isZGC()) return GCToolkitVertx.ZGC_PARSER_OUTBOX;
        if ( diary.isShenandoah()) return GCToolkitVertx.SHENANDOAH_PARSER_OUTBOX;
        return GCToolkitVertx.GENERATIONAL_HEAP_PARSER_OUTBOX;
}

and

@Override
    public void start() {
        try {
            vertx.eventBus().
                    consumer(mailBox, message -> {
                        try {
                            JVMEvent event = (JVMEvent) message.body();
                            if (event instanceof JVMTermination)
                                return;
                            DateTimeStamp now = event.getDateTimeStamp().add(event.getDuration());
                            if (now.after(timeOfLastEvent)) {
                                timeOfLastEvent = now;
                            }
                        } catch (Throwable t) {
                            LOGGER.throwing(this.getClass().getName(), "start", t);
                        }
                    });
        } catch (Throwable t) {
            LOGGER.throwing(this.getClass().getName(), "start", t);
        }
    }

I trust you will come up with a good solution to this problem.

kcpeppe commented 2 years ago

This is a well known issue… the plan is to provided a unified view of time.

dsgrieve commented 2 years ago

FWIW, the RuntimeAggregation in the samples directory does the right thing here.

ZhangShushu123 commented 2 years ago

RuntimeAggregation can't catch EventSource.GENERATIONAL and EventSource.TENURED in the same time with the reason that Verticle of Tenured and Verticle of Generational will construct RuntimeAggregation object respectively.

dsgrieve commented 2 years ago

@ZhangShushu123 - That is true, which is why there is a need for a unified view of time. I created RuntimeAggregation because I had an Aggregation that needed runtime duration to do a calculation and there was no other way to get the runtime. Which is another reason why there is a need for a unified view of time.

One thought is to put runtime duration into the JVMEvent itself, which would probably involve the method you listed above.

kcpeppe commented 2 years ago

The real issue is having a consistent start time. If your log starts from 0.000, this is easy. But if your log starts from 2345432.000 then this is also easy. But if you look starts at 800.000 it starts getting a wee bit trickier to know what value to use. Your choice will all of the analytics that depend on knowing the exact runtime duration. IMO, the answer is to have JavaVirtualMachine keep a global view of the clock. It will keep the decisions consistent which turns out to be as important if not more important than being precise.

ZhangShushu123 commented 2 years ago

I agree @kcpeppe 's opinion very much. I think we should maintain all JvmEvent in JavaVirtualMachine to get a precise timeOfLastEvent and duration.

ghost commented 2 years ago

This issue is resolved in the latest PR. I'll push a release for it to Maven Central very shortly