Fine-tune CI Workflows in PR

krapie commented 1 month ago

Description: Currently, all PRs trigger bench or sharding_test CI workflows, which may not be necessary for every code change. It would be beneficial to fine-tune the workflows so that bench CI is only run on CRDT-related code changes, sharding_test CI is only run on DB-related code changes, and basic CI is run on all other code changes. This will optimize the CI process and reduce unnecessary testing.

Why: By adjusting the CI workflows to be more specific based on code changes, we can improve efficiency, reduce run times, and ensure that the appropriate tests are executed for each PR. This will streamline the development process and enhance code quality.

binary-ho commented 1 month ago

I'm interested in this. can I try it?

sejongk commented 1 month ago

I'm interested in this. can I try it?

@binary-ho Sure. If you have any questions, feel free to ask.

binary-ho commented 1 month ago

Hello, I have questions for this issue.

[summary]

The test/bench package contains various types of tests. Should these tests "only" be executed when there are changes related to CRDT?
If not test/bench as a whole, how about categorizing the 8 different tests into separate packages and running each test package only when the associated code is modified?
I lack a clear understanding of what "CRDT-related code" exactly refers to. Does it specifically mean only the packages pkg/document, pkg/index, pkg/llrb, pkg/splay, pkg/locker, and server/packs?

1. Types of Bench Tests

Currently, the bench CI runs make bench, which executes all the test files in the test/bench package. The bench CI currently tests a variety of targets. Upon investigation, I found that there are 8 test files in total. I also ran the bench tests locally to check the time each test occupies within the overall bench test time.

Total time: 208.793s (26594374030 ns)
1. Document: 42.352% (11263222942 ns)
2. gRPC: 21.399% (5690968294 ns)
3. Locker: 4.696% (1248.75 ns)
4. PushPull: 0.762% (202707232 ns)
5. Snapshot: 0.608% (161724494 ns)
6. Sync: 0.0816% (21725827 ns)
7. TextEditing: 24.402% (6489596461 ns)
8. TreeEditing: 10.395% (2764427531 ns)

I haven't fully understood all these test codes, but should all these tests be triggered only when there are changes related to CRDT?

Are bench tests unnecessary in other scenarios?

2. Suggestion of Separating the `test/bench` Package

If not all tests are related to CRDT, it might not be appropriate to run the entire bench package only on CRDT-related changes.

If necessary, we could separate the packages.

Separating the packages could allow for more granular CI adjustments.

(The following is just an example and may not be the correct categorization due to my limited understanding of Yorkie.)

Examples

test/bench/crdt: Document, Locker, TestEditing, TreeEditing (comprising about 77.149% of the total bench test time)
test/bench/grpc: gRPC (comprising about 21.399% of the total bench test time)
test/bench/server: Snapshot, PushPull, Sync

After this separation, the test code for each package could be executed only when changes are made to the related package. example) test/bench/grpc run always and test/bench/crdt ans test/bench/server run when changes ard made in crdt related codes

If the packages need to be categorized, I would appreciate some help or hints on how to properly categorize them.

3. What Exactly Is "CRDT-related Code"?

I lack a clear understanding of what "CRDT-related code" refers to.

Does it mean only the packages pkg/document, pkg/index, pkg/llrb, pkg/splay, pkg/locker? Does it also include server/packs? or any others?

krapie commented 1 month ago

The test/bench package contains various types of tests. Should these tests "only" be executed when there are changes related to CRDT?

I'm seeing test/bench package are all CRDT-related because it tests on various situations of document editing.

If not test/bench as a whole, how about categorizing the 8 different tests into separate packages and running each test package only when the associated code is modified?

Answered in 1.

I lack a clear understanding of what "CRDT-related code" exactly refers to. Does it specifically mean only the packages pkg/document, pkg/index, pkg/llrb, pkg/splay, pkg/locker, and server/packs?

I define CRDT-related code as the tests that includes document editing between one or more two clients. More specifically, if the test contains below code, I see as CRDT-related code.

err = d1.Update(func(root *json.Object, p *presence.Presence) error {
      root.SetNewText(testKey)
      return nil
})
assert.NoError(b, err)

How do you think about this? @hackerwins

yorkie-team / yorkie