tikv / tikv

Distributed transactional key-value database, originally created to complement TiDB
https://tikv.org
Apache License 2.0
15.28k stars 2.14k forks source link

UCP: Print panic info into tikv_stderr.log #7220

Open hicqu opened 4 years ago

hicqu commented 4 years ago

Description

Print panic information into tikv_stderr.log to make it easier to search.

Related code is here https://github.com/tikv/tikv/blob/master/components/tikv_util/src/lib.rs#L466

Difficulty

Score

Mentor(s)

Recommended Skills

Learning Materials

skyitachi commented 4 years ago

/pick-up-challenge

sre-bot commented 4 years ago

@skyitachi pick up issue success

sre-bot commented 4 years ago

This pick has been automatically canceled after more than a week.

skyitachi commented 4 years ago

/pick-up-challenge

sre-bot commented 4 years ago

@skyitachi pick up issue success

skyitachi commented 4 years ago

@hicqu this is pr

sre-bot commented 4 years ago

This pick has been automatically canceled after more than a week.

XiaochenCui commented 4 years ago

/pick-up-challenge

sre-bot commented 4 years ago

@XiaochenCui pick up issue success

sre-bot commented 4 years ago

This pick has been automatically canceled after more than a week.

YKG commented 4 years ago

/pick-up-challenge

sre-bot commented 4 years ago

@YKG pick up issue success

BusyJay commented 4 years ago

I think there are several things need to be considered:

  1. tikv_stderr.log doesn't rotate, the file can become very large if tikv keeps panicking.
  2. I remember tikv_stderr.log was truncated on every start in tidb-ansible deployment, should check if it's still true for latest version and tiup deployment.
  3. Searching may not be easy as it seems. When searching panics, there will be contextual information needs to be looked up. Print to stderr will have to search twice.
  4. It seems inconsistent when users specify --log-file but get some logs in stderr.
YKG commented 4 years ago

@BusyJay Do you mean a fix on this issue might be less useful? If it's true, should I stop working on this issue?

I have read the releated code, I think the 4th consideration you point out is expected behavior, because if --log-file specified, tikv will use an async file logger, when it panics, the global logger will be swithed to a stderr terminal logger wrapper , all the pending logs will be logged to stderr.

BusyJay commented 4 years ago

...should I stop working on this issue...

You can still work on the issue, but I think there should be more discussions. For example, adding a configuration to log the trace to stderr in addition to log seems a balance between context and search.

/cc @hicqu

..when it panics, the global logger will be swithed to a stderr terminal logger wrapper...

What you mention is implement details, what I'm talking about is the use experience.

breezewish commented 4 years ago

Users also cannot search in tikv_stderr.log when using TiDB Dashboard, so this change may make it harder to search. Considering that the other output lines in stderr (e.g. produced by grpc / rocksdb) is unlikely to conform our unified log format, we may never have a chance to let user search it.