pingcap / tiflash

The analytical engine for TiDB and TiDB Cloud. Try free: https://tidbcloud.com/free-trial
https://docs.pingcap.com/tidb/stable/tiflash-overview
Apache License 2.0

TiFlash failed to replicate data from TiKV in the master branch #9147

Closed JaySon-Huang closed 3 months ago

JaySon-Huang commented 3 months ago

Bug Report


1. Minimal reproduce step (Required)

2. What did you expect to see? (Required)

3. What did you see instead (Required)

https://do.pingcap.net/jenkins/blue/organizations/jenkins/pingcap%2Ftiflash%2Fpull_integration_test/detail/pull_integration_test/492/pipeline/372

In the CI integration test, TiFlash failed to replicate data from TiKV, with error logs like:

[2024/06/14 11:44:39.043 +08:00] [ERROR] [kv.rs:774] ["dispatch raft msg from gRPC to raftstore fail"] [err=Grpc(Codec(WireError(InvalidEnumValue(15))))] [thread_id=0x5]
[2024/06/14 11:44:39.043 +08:00] [ERROR] [kv.rs:781] ["KvService::batch_raft send response fail"] [err=RemoteStopped] [thread_id=0x5]
[2024/06/14 11:44:44.046 +08:00] [INFO] [kv.rs:737] ["batch_raft RPC is called, new gRPC stream established"] [source_store_id=Some(1)] [thread_id=0x5]
[2024/06/14 11:44:44.047 +08:00] [ERROR] [kv.rs:774] ["dispatch raft msg from gRPC to raftstore fail"] [err=Grpc(Codec(WireError(InvalidEnumValue(15))))] [thread_id=0x5]
[2024/06/14 11:44:44.047 +08:00] [ERROR] [kv.rs:781] ["KvService::batch_raft send response fail"] [err=RemoteStopped] [thread_id=0x5]
[2024/06/14 11:44:49.054 +08:00] [INFO] [kv.rs:737] ["batch_raft RPC is called, new gRPC stream established"] [source_store_id=Some(1)] [thread_id=0x5]
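For triage, the lines above can be scanned mechanically to pull out the rejected enum value. The following is a minimal sketch, assuming the bracketed log layout shown in this excerpt; the regular expressions and the helper name `invalid_enum_values` are illustrative and not part of TiFlash or TiKV tooling.

```python
import re
from collections import Counter

# Matches the layout seen in the excerpt:
# [timestamp] [LEVEL] [file:line] ["message"] [key=value] ...
LINE_RE = re.compile(
    r'^\[(?P<ts>[^\]]+)\] \[(?P<level>[A-Z]+)\] \[(?P<src>[^\]]+)\] '
    r'\["(?P<msg>[^"]*)"\](?P<fields>.*)$'
)
ENUM_RE = re.compile(r"InvalidEnumValue\((\d+)\)")


def invalid_enum_values(lines):
    """Count how often each InvalidEnumValue(N) appears in ERROR lines."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None or m.group("level") != "ERROR":
            continue
        found = ENUM_RE.search(m.group("fields"))
        if found:
            counts[int(found.group(1))] += 1
    return counts


# Three lines copied from the log excerpt in this report.
SAMPLE = [
    '[2024/06/14 11:44:39.043 +08:00] [ERROR] [kv.rs:774] ["dispatch raft msg from gRPC to raftstore fail"] [err=Grpc(Codec(WireError(InvalidEnumValue(15))))] [thread_id=0x5]',
    '[2024/06/14 11:44:39.043 +08:00] [ERROR] [kv.rs:781] ["KvService::batch_raft send response fail"] [err=RemoteStopped] [thread_id=0x5]',
    '[2024/06/14 11:44:44.046 +08:00] [INFO] [kv.rs:737] ["batch_raft RPC is called, new gRPC stream established"] [source_store_id=Some(1)] [thread_id=0x5]',
]

if __name__ == "__main__":
    print(invalid_enum_values(SAMPLE))
```

Run over the full excerpt, every decode failure reports the same value, 15, and the stream is re-established roughly every five seconds before failing again.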

This is because TiKV introduced a new precheck process before snapshot generation (https://github.com/tikv/tikv/pull/17019) to resolve https://github.com/tikv/tikv/issues/15972, while tiflash-proxy has not been updated and does not recognize the new type of raft message.
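The failure mode can be illustrated with a small sketch: a receiver built against an older enum definition rejects any wire value it has no variant for. This is a simplified Python model, not the actual Rust code in tiflash-proxy. The enum `KnownMessageType`, its variant names, and the `dispatch` helper are hypothetical placeholders; only the value 15 is taken from the log above.

```python
from enum import IntEnum


class InvalidEnumValue(Exception):
    """Raised when a wire value has no matching enum variant."""


class KnownMessageType(IntEnum):
    # Placeholder variants standing in for the message types an
    # outdated receiver knows about. The real names are defined in
    # the protobuf files and are not reproduced here.
    TYPE_0 = 0
    TYPE_14 = 14


def decode_message_type(wire_value: int) -> KnownMessageType:
    """Strictly decode an enum field, rejecting unknown values."""
    try:
        return KnownMessageType(wire_value)
    except ValueError:
        raise InvalidEnumValue(wire_value) from None


def dispatch(wire_value: int) -> str:
    """Mimic the receiver: a decode failure aborts the whole message."""
    try:
        msg_type = decode_message_type(wire_value)
    except InvalidEnumValue as err:
        return f"dispatch fail: InvalidEnumValue({err.args[0]})"
    return f"dispatched {msg_type.name}"


if __name__ == "__main__":
    print(dispatch(14))  # a value the old receiver knows
    print(dispatch(15))  # the value added by the newer sender
```

Under these assumptions, a sender that starts emitting value 15 breaks every message carrying it until the receiver is rebuilt against the updated definition, which matches the repeated failures in the log.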

4. What is your TiFlash version? (Required)

master

Yui-Song commented 3 months ago

@CalvinNeo Do we have UT for replicating data from TiKV? From my point of view, it's a very basic feature.

CalvinNeo commented 3 months ago

> @CalvinNeo Do we have UT for replicating data from TiKV? From my point of view, it's a very basic feature.

We have UTs in both TiFlash and tiflash-proxy. See https://github.com/pingcap/tidb-engine-ext/tree/raftstore-proxy/proxy_tests for examples. However, this is not a case that a UT should cover, because the change happened in TiKV.

seiya-annie commented 3 months ago

/found customer

JaySon-Huang commented 3 months ago

/remove-label found/customer

ti-chi-bot[bot] commented 3 months ago

@JaySon-Huang: The label(s) found/customer cannot be applied. These labels are supported: tide/merge-method-rebase, tide/merge-method-squash, tide/merge-method-merge, duplicate, first-time-contributor, good first issue, ok-to-test, needs-ok-to-test, help wanted, invalid, question, wontfix, make-local-great-again, needs-cherry-pick-release-5.4, needs-cherry-pick-release-6.1, needs-cherry-pick-release-6.5, needs-cherry-pick-release-7.1, needs-cherry-pick-release-7.5, needs-cherry-pick-release-8.1, affects-5.4, affects-6.1, affects-6.5, affects-7.1, affects-7.5, affects-8.1, may-affects-5.4, may-affects-6.1, may-affects-6.5, may-affects-7.1, may-affects-7.5, may-affects-8.1, needs-rebase.
