PingCAP-QE / ci

Continue intergration tests
Apache License 2.0
19 stars 97 forks source link

increase ticdc kafka integration test resource #2934

Closed sdojjy closed 2 months ago

sdojjy commented 2 months ago

ticdc kafka integration test failed frequently, but when we ran it on local env, it never fail, so this PR increase ticdc kafka integration test resource.

ti-chi-bot[bot] commented 2 months ago

I have already done a preliminary review for you, and I hope to help you do a better job.

Based on the title and description, this pull request aims to increase the resource allocation for the ticdc Kafka integration test. The changes in the diff show that the CPU and memory requests and limits for the Kafka container have been increased.

One potential problem with this approach is that it may not necessarily address the root cause of the test failures. Increasing resource allocation can help when the test is hitting resource limits, but it could also be masking other issues such as race conditions or flaky tests. Therefore, it would be better to investigate the root cause of the test failures first before considering increasing resources as a solution.

If increasing resource allocation is deemed necessary, it is important to be mindful of the potential impact on other parts of the system that share the same underlying infrastructure. In this case, the increased resource allocation for the Kafka container could potentially affect the performance of other containers running on the same node.

If the decision is made to go ahead with the resource increase, it would be good to monitor the system performance and adjust the resource allocation as necessary to ensure optimal performance.

Overall, it would be good to have more context on the frequency and nature of the test failures to better understand the root cause and determine the appropriate solution.

ti-chi-bot[bot] commented 2 months ago

I have already done a preliminary review for you, and I hope to help you do a better job.

Based on the pull request's title and description, the key change is an increase in resource allocation for the ticdc kafka integration test. This is done because the test frequently failed, but never failed when run locally.

The diff shows that resources for zookeeper, golang, and kafka have been increased. Specifically, the CPU for zookeeper has been increased from 200m to 2000m, CPU for golang has been increased from 2 to 12, CPU for kafka has been increased from 200m to 4000m, and memory for golang and kafka have both been increased from 16Gi to 32Gi and 4Gi to 6Gi, respectively.

There are no obvious potential problems with this pull request, but it is important to ensure that the increased resource allocation does not negatively impact other tests or processes. It may be worth running a load test to ensure that the resources allocated are appropriate.

In terms of fixing suggestions, it would be helpful to provide more context around why the integration test was failing and why increasing resources was the solution. Additionally, it may be helpful to document the new resource allocations somewhere so that other engineers are aware of the changes.

purelind commented 2 months ago

This is a temporary resource adjustment change. After confirming the issue, heavy testing needs to be migrated to avoid excessive resource occupation by individual pipelines.

ti-chi-bot[bot] commented 2 months ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: purelind

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[pipelines/OWNERS](https://github.com/PingCAP-QE/ci/blob/main/pipelines/OWNERS)~~ [purelind] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment
ti-chi-bot[bot] commented 2 months ago

[LGTM Timeline notifier]

Timeline: