pingcap / tidb

TiDB is an open-source, cloud-native, distributed, MySQL-Compatible database for elastic scale and real-time analytics. Try AI-powered Chat2Query free at : https://www.pingcap.com/tidb-serverless/
https://pingcap.com
Apache License 2.0
36.4k stars 5.73k forks source link

*: support bazel #33691

Closed hawkingrei closed 2 years ago

hawkingrei commented 2 years ago

What problem does this PR solve?

Issue Number: close #34083

Problem Summary:

What is changed and how it works?

Check List

Tests

Side effects

Documentation

Release note

None
ti-chi-bot commented 2 years ago

[REVIEW NOTIFICATION]

This pull request has been approved by:

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment. After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review. Reviewer can cancel approval by submitting a request changes review.
sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Code Coverage Details: https://codecov.io/github/pingcap/tidb/commit/c277dbcc9edbc53c5c7f147c1f23e2236e06e473

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

zz-jason commented 2 years ago

could you explain more about why adding this support?

hawkingrei commented 2 years ago

could you explain more about why adding this support?

We run thousands of builds and tests in a day. But I find that we have not any local cache and remote cache to speed up ci. so I import the bazel into tidb to solve this problem. I believe that it can boost performance 10x.

hawkingrei commented 2 years ago

could you explain more about why adding this support?

if I meet an unstable test in the CI, I will just rerun this test case by the bazel. if I have a test case that it just tests util package and util package have not any changes, we can use the remote cache to skip this test. image

image.png

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

breezewish commented 2 years ago

How about using a local cache that is provided by Go natively? For example, modern CI like GitHub Action provides per-branch cache, which is sufficient to be used as a cache, even incremental one.

I made a demo of utilizing GitHub Action cache:

  1. Full test run

    ok   github.com/breeswish/go-test-cache-demo/bar 222.714s
    ok   github.com/breeswish/go-test-cache-demo/box 22.670s
    ok   github.com/breeswish/go-test-cache-demo/foo 222.729s
  2. With a cache, update a single package (commit): You can see that test results of untouched packages are reused (from the master branch)

    ok   github.com/breeswish/go-test-cache-demo/bar 220.462s
    ok   github.com/breeswish/go-test-cache-demo/box (cached)
    ok   github.com/breeswish/go-test-cache-demo/foo (cached)
  3. In the same PR, make change to non-code (commit): You can see the cache from 2 is effective, so that CI is finished immediately, only take seconds

    ok   github.com/breeswish/go-test-cache-demo/bar (cached)
    ok   github.com/breeswish/go-test-cache-demo/box (cached)
    ok   github.com/breeswish/go-test-cache-demo/foo (cached)

We can use the same strategy in our own CI infrastructure to utilize the same effect:

  1. For master commit: Test and save cache
  2. For PR commit: Read a recent cache from current PR (if exists), or a recent cache from master

In this way, PR test can be fast, when the PR is first proposed, or the PR is updated incrementally.

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

sre-bot commented 2 years ago

Please follow PR Title Format:

Or if the count of mainly changed packages are more than 3, use

hawkingrei commented 2 years ago

How about using a local cache that is provided by Go natively? For example, modern CI like GitHub Action provides per-branch cache, which is sufficient to be used as a cache, even incremental one.

I made a demo of utilizing GitHub Action cache:

1. [Full test run](https://github.com/breeswish/go-test-cache-demo/runs/6038147567?check_suite_focus=true)
   ```
   ok     github.com/breeswish/go-test-cache-demo/bar 222.714s
   ok     github.com/breeswish/go-test-cache-demo/box 22.670s
   ok     github.com/breeswish/go-test-cache-demo/foo 222.729s
   ```

2. [With a cache, update a single package](https://github.com/breeswish/go-test-cache-demo/runs/6038730204?check_suite_focus=true) ([commit](https://github.com/breeswish/go-test-cache-demo/pull/5/commits/0ef0e72effcf6f3fbe620149b0e817d88b5d359e)): You can see that test results of untouched packages are reused (from the master branch)
   ```
   ok     github.com/breeswish/go-test-cache-demo/bar 220.462s
   ok     github.com/breeswish/go-test-cache-demo/box (cached)
   ok     github.com/breeswish/go-test-cache-demo/foo (cached)
   ```

3. [In the same PR, make change to non-code](https://github.com/breeswish/go-test-cache-demo/runs/6038894438?check_suite_focus=true) ([commit](https://github.com/breeswish/go-test-cache-demo/pull/5/commits/a6a20405b9c89f91dec6d8f62bb203f1a4fd213f)): You can see the cache from 2 is effective, so that CI is finished immediately, only take seconds
   ```
   ok     github.com/breeswish/go-test-cache-demo/bar (cached)
   ok     github.com/breeswish/go-test-cache-demo/box (cached)
   ok     github.com/breeswish/go-test-cache-demo/foo (cached)
   ```

We can use the same strategy in our own CI infrastructure to utilize the same effect:

1. For master commit: Test and save cache

2. For PR commit: Read a recent cache from current PR (if exists), or a recent cache from master

In this way, PR test can be fast, when the PR is first proposed, or the PR is updated incrementally.

Why we cannot use the golang

How about using a local cache that is provided by Go natively? For example, modern CI like GitHub Action provides per-branch cache, which is sufficient to be used as a cache, even incremental one.

I made a demo of utilizing GitHub Action cache:

1. [Full test run](https://github.com/breeswish/go-test-cache-demo/runs/6038147567?check_suite_focus=true)
   ```
   ok     github.com/breeswish/go-test-cache-demo/bar 222.714s
   ok     github.com/breeswish/go-test-cache-demo/box 22.670s
   ok     github.com/breeswish/go-test-cache-demo/foo 222.729s
   ```

2. [With a cache, update a single package](https://github.com/breeswish/go-test-cache-demo/runs/6038730204?check_suite_focus=true) ([commit](https://github.com/breeswish/go-test-cache-demo/pull/5/commits/0ef0e72effcf6f3fbe620149b0e817d88b5d359e)): You can see that test results of untouched packages are reused (from the master branch)
   ```
   ok     github.com/breeswish/go-test-cache-demo/bar 220.462s
   ok     github.com/breeswish/go-test-cache-demo/box (cached)
   ok     github.com/breeswish/go-test-cache-demo/foo (cached)
   ```

3. [In the same PR, make change to non-code](https://github.com/breeswish/go-test-cache-demo/runs/6038894438?check_suite_focus=true) ([commit](https://github.com/breeswish/go-test-cache-demo/pull/5/commits/a6a20405b9c89f91dec6d8f62bb203f1a4fd213f)): You can see the cache from 2 is effective, so that CI is finished immediately, only take seconds
   ```
   ok     github.com/breeswish/go-test-cache-demo/bar (cached)
   ok     github.com/breeswish/go-test-cache-demo/box (cached)
   ok     github.com/breeswish/go-test-cache-demo/foo (cached)
   ```

We can use the same strategy in our own CI infrastructure to utilize the same effect:

1. For master commit: Test and save cache

2. For PR commit: Read a recent cache from current PR (if exists), or a recent cache from master

In this way, PR test can be fast, when the PR is first proposed, or the PR is updated incrementally.

The size of TiDB cache which includes all builds and tests is 19GB. it is too big to upload all cache to S3 and download all cache the next time. and we cannot control the job task to make it run on the same machine and use local cache every time. it is hard because Jenkins is not k8s CRD. we cannot schedule anything. so TiDB CI does not have any cache now.

hawkingrei commented 2 years ago

Bazel has a better design. it can use remote cache to achieve distributed compilation and testing.

hawkingrei commented 2 years ago

/check-issue-triage-complete

hawkingrei commented 2 years ago

I just have a simple question. When we updating the go.mod, how we make sure the dependencies in bzl is up to date?

use a robot to verify it.

tiancaiamao commented 2 years ago

Can we avoid adding the BUILD.bazel and put the generating process in the script?

bazel run //:gazelle
hawkingrei commented 2 years ago

Can we avoid adding the BUILD.bazel and put the generating process in the script?

bazel run //:gazelle

Yes, We can do it.

breezewish commented 2 years ago

The size of TiDB cache which includes all builds and tests is 19GB. it is too big to upload all cache to S3 and download all cache the next time. and we cannot control the job task to make it run on the same machine and use local cache every time. it is hard because Jenkins is not k8s CRD. we cannot schedule anything. so TiDB CI does not have any cache now.

This is a good point! Previously in Jenkins CI we usually use local cache. Network cache is too expansive. Maybe you can ask EE for help. I think caching is a totally valid requirement.

hawkingrei commented 2 years ago

Can we avoid adding the BUILD.bazel and put the generating process in the script?

bazel run //:gazelle

Done

tiancaiamao commented 2 years ago

It's a hard decision to make. I'd say, I like the potential performance boost by this change. However, it comes with costs: how do we maintain this toolchain in the future? Bazel has its learning curve, it's difficult for find someone else who is familiar with bazel within the team.

Also, mentioned here https://github.com/golang/go/issues/42785#issuecomment-740017077

it's detrimental for builds

It's better just to use it for CI test environment, we still use go to build the release binary.

hawkingrei commented 2 years ago

It's a hard decision to make. I'd say, I like the potential performance boost by this change. However, it comes with costs: how do we maintain this toolchain in the future? Bazel has its learning curve, it's difficult for find someone else who is familiar with bazel within the team.

Also, mentioned here golang/go#42785 (comment)

it's detrimental for builds

It's better just to use it for CI test environment, we still use go to build the release binary.

bazel just is a tool that can manage the build/test cache and execute distributed builds and tests. His configuration can be automatically generated. It is easy to use. so we still can go to build the release binary.

hawkingrei commented 2 years ago

This change looks good to me. However, could you elaborate the following works in #34083?

This is a fundamental build system change and it deserves a design doc for the whole lifecycle and how other contributors should work with it.

it will be done, We still fight with Jenkins ci. It is so hard.

breezewish commented 2 years ago

Just curious, if these BAZEL files are automatically generated, can they be git-ignored? It's usually not a good practice to include generated files in the git repository.

hawkingrei commented 2 years ago

Just curious, if these BAZEL files are automatically generated, can they be git-ignored? It's usually not a good practice to include generated files in the git repository.

In fact, Some bazel files are automatically generated. for example flaky test (It isn't recommended) and some other config have to be controlled by development such as visibility and test size that controls the test timeout. so it must be with the repo.

BTW, the open resource repos using the bazel like abseil-cpp, cockroach, tensorflow have many BAZEl.build files in their repo.