pingcap / tidb-dashboard

A Web UI for monitoring, diagnosing and managing the TiDB cluster.
https://docs.pingcap.com/tidb/stable/dashboard-intro
Apache License 2.0

PD profiling failed occasionally #1146

Open aylei opened 2 years ago

aylei commented 2 years ago

Bug Report

What did you do?

Profile all instances through dashboard

What did you expect to see?

All profiles succeed

What did you see instead?

Profiling of some PD instances failed

What version of TiDB Dashboard are you using (./tidb-dashboard --version)?

v5.4.0

Error message:

While testing DBaaS v5.4.0, we found that manual profiling in the dashboard failed:

failed to fetch and write to temp file: failed to fetch profile with *.proto format: error.pd.client_request_failed: Failed to send PD API request, cause: Get "https://db-pd-0.db-pd-peer.tidb1373933076657974810.svc:2379/pd/api/v1/debug/pprof/profile?seconds=30": net/http: HTTP/1.x transport connection broken: malformed HTTP response "\x00\x00\x18\x04\x00\x00\x00\x00\x00\x00\x05\x00\x10\x00\x00\x00\x03\x00\x00\x00\xfa\x00\x06\x00\x10\x01@\x00\x04\x00\x10\x00\x00"
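
The byte string quoted in the error above can be read as an HTTP/2 frame rather than an HTTP/1.x status line. The Go sketch below decodes it under that assumption. The helper names (`parseFrameHeader`, `parseSettings`) are illustrative only and are not part of TiDB Dashboard or PD.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// frameHeader mirrors the fixed 9-byte HTTP/2 frame header.
type frameHeader struct {
	Length   uint32 // 24-bit payload length
	Type     uint8  // 0x4 means SETTINGS
	Flags    uint8
	StreamID uint32 // 31-bit stream identifier
}

// setting is one 6-byte entry in a SETTINGS frame payload.
type setting struct {
	ID    uint16
	Value uint32
}

// parseFrameHeader decodes the first 9 bytes of b as an HTTP/2 frame header.
func parseFrameHeader(b []byte) (frameHeader, bool) {
	if len(b) < 9 {
		return frameHeader{}, false
	}
	return frameHeader{
		Length:   uint32(b[0])<<16 | uint32(b[1])<<8 | uint32(b[2]),
		Type:     b[3],
		Flags:    b[4],
		StreamID: binary.BigEndian.Uint32(b[5:9]) & 0x7fffffff,
	}, true
}

// parseSettings splits a SETTINGS payload into (identifier, value) pairs.
func parseSettings(payload []byte) []setting {
	var out []setting
	for len(payload) >= 6 {
		out = append(out, setting{
			ID:    binary.BigEndian.Uint16(payload[0:2]),
			Value: binary.BigEndian.Uint32(payload[2:6]),
		})
		payload = payload[6:]
	}
	return out
}

// responseBytes is the "malformed HTTP response" copied from the error message.
var responseBytes = []byte("\x00\x00\x18\x04\x00\x00\x00\x00\x00\x00\x05\x00\x10\x00\x00\x00\x03\x00\x00\x00\xfa\x00\x06\x00\x10\x01@\x00\x04\x00\x10\x00\x00")

func main() {
	h, ok := parseFrameHeader(responseBytes)
	if !ok {
		fmt.Println("input shorter than a frame header")
		return
	}
	fmt.Printf("length=%d type=0x%x flags=0x%x stream=%d\n", h.Length, h.Type, h.Flags, h.StreamID)
	for _, s := range parseSettings(responseBytes[9 : 9+h.Length]) {
		fmt.Printf("setting id=%d value=%d\n", s.ID, s.Value)
	}
}
```

Read this way, the bytes form a 24-byte SETTINGS frame (type 0x4) on stream 0 carrying four settings. That would suggest the server answered in HTTP/2 while the client was parsing HTTP/1.x, although the decode alone does not show why the protocols diverged.
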
https://staging.debug.tidbcloud.com/orgs/1369847559691367379

shhdgit commented 2 years ago

Based on https://github.com/golang/go/issues/21336#issuecomment-325737598, I suspect that sharing the same tls.Config instance between the proxy and the dashboard client at https://github.com/tikv/pd/blob/master/pkg/dashboard/dashboard.go#L78-L88 might cause this problem, but I'm not sure.
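
One way to rule out the shared-config hypothesis is to give the outbound client its own copy of the TLS settings. The Go sketch below assumes that server-side setup may append "h2" to `NextProtos` on the shared config. This is not the actual PD or TiDB Dashboard code, and `newClientTLSConfig`, `newHTTPClient`, and `demo` are hypothetical names.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
)

// newClientTLSConfig returns a private copy of shared for outbound requests,
// so later changes to the shared config cannot affect the client.
func newClientTLSConfig(shared *tls.Config) *tls.Config {
	if shared == nil {
		return nil
	}
	cfg := shared.Clone()
	// Drop any ALPN protocols inherited from the shared config. The transport
	// then decides by itself which protocols it offers.
	cfg.NextProtos = nil
	return cfg
}

// newHTTPClient builds a client that never shares a *tls.Config with a server.
func newHTTPClient(shared *tls.Config) *http.Client {
	return &http.Client{
		Transport: &http.Transport{TLSClientConfig: newClientTLSConfig(shared)},
	}
}

// demo simulates a server setup step that mutates the shared config after the
// client was built, and reports how many ALPN protocols each side ends up with.
func demo() (sharedProtos, clientProtos int) {
	shared := &tls.Config{MinVersion: tls.VersionTLS12}
	client := newHTTPClient(shared)

	// Assumption: something on the server side advertises HTTP/2 this way.
	shared.NextProtos = append(shared.NextProtos, "h2", "http/1.1")

	tr := client.Transport.(*http.Transport)
	return len(shared.NextProtos), len(tr.TLSClientConfig.NextProtos)
}

func main() {
	sharedProtos, clientProtos := demo()
	fmt.Println(sharedProtos, clientProtos) // prints "2 0"
}
```

If the failures stop once the client uses a cloned config, that would support the hypothesis. If they continue, the cause lies elsewhere.
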

PTAL, thanks~ @HunDunDM

breezewish commented 2 years ago

We should add TLS to the test cases to make sure that our features work in TLS environments. @shhdgit Could you help set them up?
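
As a rough idea of what a TLS-enabled test could look like, the Go sketch below serves a stub response over TLS with `net/http/httptest` and fetches it with a client that trusts the test certificate. The stub handler and the `fetchStubProfile` helper are assumptions for illustration. The real test setup for the dashboard may look quite different.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// profilePath is the PD endpoint that appears in the reported error.
const profilePath = "/pd/api/v1/debug/pprof/profile"

// fetchStubProfile starts a throwaway TLS server that returns a fixed body
// for the profile endpoint, then requests it over HTTPS.
func fetchStubProfile() (int, string, error) {
	mux := http.NewServeMux()
	mux.HandleFunc(profilePath, func(w http.ResponseWriter, r *http.Request) {
		// A real PD would stream pprof data here. This is only a stub.
		io.WriteString(w, "stub-profile")
	})

	srv := httptest.NewTLSServer(mux)
	defer srv.Close()

	// srv.Client() is preconfigured to trust the server's test certificate.
	resp, err := srv.Client().Get(srv.URL + profilePath + "?seconds=1")
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return 0, "", err
	}
	return resp.StatusCode, string(body), nil
}

func main() {
	status, body, err := fetchStubProfile()
	fmt.Println(status, body, err)
}
```

A test built along these lines could swap in the dashboard's own HTTP client instead of `srv.Client()` to check that it handles a TLS endpoint correctly.
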

shhdgit commented 2 years ago

Yeah, sure. It looks like this case needs to be tested on the PD side too.