PingCAP-QE / ci

Continue intergration tests
Apache License 2.0
19 stars 97 forks source link

chore(tiflash): update tiflash new merged ut #2938

Closed purelind closed 2 months ago

ti-chi-bot[bot] commented 2 months ago

I have already done a preliminary review for you, and I hope to help you do a better job.

The Pull Request (PR) involves changes to the Continuous Integration/Continuous Deployment (CI/CD) pipeline script (merged_unit_test.groovy) for the TiFlash project.

Key changes:

  1. The timeout for the 'Checkout' stage is reduced from 120 minutes to 15 minutes.
  2. The code checkout process is updated. Instead of directly cloning the code from GitHub, the new logic retrieves a tarball of the source code (src-tiflash.tar.gz) from a file server, extracts it, changes the ownership, and then executes git status.
  3. The installation steps for tools like ccache, cmake, clang-format, clang-tidy, and gcovr are slightly modified. No major changes.
  4. The 'Prepare Cache' stage logic is updated, improving the handling of the cache files. Also, slight changes in the path of ccache_tar_file.
  5. In the "Configure Project" stage, the -DTEST_LLVM_COVERAGE=ON flag was removed.
  6. The logic for the "Build TiFlash" stage is changed slightly in the way it builds the targets. Also, the 'read_only' option for ccache is now set to true.
  7. The "Post Build" stage has been reorganized, and the logic for updating ccache has been removed.
  8. The "Coverage" stage has been entirely removed from the pipeline.

Potential Problems:

  1. Reducing the timeout time for the 'Checkout' stage could lead to failures if the checkout operation takes longer than expected.
  2. The new checkout logic assumes the tarball exists on the file server and might fail if the tarball isn't present or doesn't have the expected contents.
  3. The changes in the tool installation scripts might cause issues if the new scripts do not correctly handle all scenarios (tool already installed, installation failures, etc).
  4. The removal of the -DTEST_LLVM_COVERAGE=ON flag will stop the generation of coverage data.
  5. The removal of the "Coverage" stage will stop the generation of coverage reports.
  6. Changing 'read_only' option for ccache to true could cause issues if the pipeline tries to write to the cache.

Suggested Fixes:

  1. Perform testing to ensure the reduced timeout for checkout is adequate for all scenarios.
  2. Ensure that the tarball exists on the file server and contains the correct source code. Add error handling for scenarios where the tarball doesn't exist or fails to extract correctly.
  3. Test the new tool installation scripts in various scenarios to ensure they work correctly.
  4. If coverage data is still required, consider adding back the -DTEST_LLVM_COVERAGE=ON flag.
  5. If coverage reports are still required, consider adding back the "Coverage" stage.
  6. Evaluate the consequences of setting 'read_only' to true for ccache. If the pipeline needs to write to the cache, this setting could cause failures.
ti-chi-bot[bot] commented 2 months ago

[LGTM Timeline notifier]

Timeline:

ti-chi-bot[bot] commented 2 months ago

New changes are detected. LGTM label has been removed.

ti-chi-bot[bot] commented 2 months ago

[APPROVALNOTIFIER] This PR is APPROVED

Approval requirements bypassed by manually added approval.

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - **[jobs/OWNERS](https://github.com/PingCAP-QE/ci/blob/main/jobs/OWNERS)** - **[pipelines/OWNERS](https://github.com/PingCAP-QE/ci/blob/main/pipelines/OWNERS)** Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment
ti-chi-bot[bot] commented 2 months ago

I have already done a preliminary review for you, and I hope to help you do a better job.

The pull request titled "chore(tiflash): update tiflash new merged ut" proposes several changes in the Groovy scripts related to the Jenkins pipeline for the 'tiflash' project. Here are the key changes:

  1. The 'disabled' line in the 'pingcap/tiflash/merged_unit_test' job has been removed, enabling the job.

  2. The 'Checkout' stage has been significantly refactored. Previously, the 'git checkout' operation was performed directly with a retry mechanism. Now, the code first checks if the necessary files are present in a remote storage 'ks3://ee-fileserver/download/cicd/daily-cache-code/src-tiflash.tar.gz'. If they are, the files are downloaded, extracted, and used directly.

  3. The timeout for the 'Checkout' stage has been reduced from 120 minutes to 15 minutes.

  4. The 'Prepare tools' stage has been reformatted, but the logic remains the same.

  5. The 'Proxy-Cache' stage has been refactored, with the path to the 'proxy_cache_file' being changed.

  6. The 'Configure Project' stage has an added TODO comment suggesting simplification of the configuration and build logic.

  7. The 'Coverage' stage has been removed in the PR.

Potential problems and suggestions:

  1. The PR removes the entire 'Coverage' stage, which might affect the test coverage statistics for the project. If test coverage reports are still required, we should consider integrating this functionality back into the pipeline.

  2. The 'Checkout' stage now depends on a remote storage. This introduces a single point of failure in case the remote server is down, or the specific file is not available. It's advisable to add a fallback mechanism to clone from GitHub if the file download fails.

  3. The PR does not provide any context or explanation for the changes made. It would be helpful to include comments explaining the rationale behind these changes, particularly for significant changes like the removal of the 'Coverage' stage.

  4. The PR introduces a hard-coded path to the 'ks3://ee-fileserver/download/cicd/daily-cache-code/src-tiflash.tar.gz' resource. This could be problematic for maintaining and scaling the pipeline. Consider replacing this with a configurable variable.

  5. The PR does not include any code comments on what the changes are doing. Adding comments to the code can greatly improve maintainability and readability.

  6. Ensure the reduced timeout for the 'Checkout' stage is sufficient for the operations performed in this stage.

  7. The PR does not include any new tests or mention of testing the changes. It would be good to ensure the changes have been thoroughly tested before merging.

ti-chi-bot[bot] commented 2 months ago

I have already done a preliminary review for you, and I hope to help you do a better job.

The pull request contains updates to the unit test for the tiflash project. Here are the key changes:

  1. The disabled(true) line was removed, thus enabling the pipeline job.
  2. The checkout stage has been updated to reduce the timeout from 120 minutes to 15 minutes.
  3. The checkout stage script has been significantly changed. Previously, the script would checkout the refs from the git repository, now it removes all files in the current directory, downloads a tar file from a ks3 storage bucket, and extracts the codebase from the tar file.
  4. The stages for installing tools (Ccache, Cmake, Clang-Format, Clang-Tidy, Coverage) have been reformatted but the functionality seems to remain the same.
  5. The "Prepare Cache" stage has been updated to print more information, and the ccache read_only option has been set to true.
  6. The "Build TiFlash" stage has been modified to run fewer commands.
  7. The "Coverage" stage has been removed.

Potential problems:

  1. The new checkout method of downloading and extracting the codebase from a tar file located in a ks3 storage bucket may not be reliable as compared to directly checking out from a git repository. If the tar file is not updated regularly, it might not have the latest code changes.
  2. The "Coverage" stage has been removed. If code coverage is required for the project, this might be a problem.
  3. The ccache read_only option has been set to true. This means that ccache will not be able to update the cache with the results of new compilations. This could potentially slow down the build process.

Fixing suggestions:

  1. If the tar file in the ks3 storage bucket is not updated regularly, it would be better to checkout the code directly from the git repository.
  2. If code coverage is required, consider adding the "Coverage" stage back in.
  3. If the ccache needs to be updated with the results of new compilations, consider setting the read_only option to false.