tikv / tikv

Distributed transactional key-value database, originally created to complement TiDB
https://tikv.org
Apache License 2.0
15.04k stars 2.12k forks source link

Refactor gc_worker with tikv_util/worker #10641

Open Little-Wallace opened 3 years ago

Little-Wallace commented 3 years ago

Feature Request

Is your feature request related to a problem? Please describe:

Now we use two thread for gc_worker and gc_manager. But in actually, gc manager does not cost much CPU, we can merge them into one thread.

Describe the feature you'd like:

I hope most of threads in TiKV shall be implemented with tikv_util::worker::pool::Worker so that someday we may merge some threads into one thread pool.

Describe alternatives you've considered:

Teachability, Documentation, Adoption, Migration Strategy:

You may need to know something about GC and threadpool.

Xuanwo commented 3 years ago

I am very interested in this project and I will first take the time to understand the existing logic!

Xuanwo commented 3 years ago

Roadmap after offline nigotiation:

Xuanwo commented 3 years ago

For Takedown GcManager, we need more details for the methods.

Xuanwo commented 3 years ago

@Little-Wallace Hi, is there any advice for keeping this task going further?

Little-Wallace commented 3 years ago

I think we shall merge the GcManagerContext,GcManager , GcWorker and GcRunner into two module. A GcManager holder GcManagerContext and Worker<GcTask> . GcRunner will run in worker of GcManager.

Little-Wallace commented 3 years ago

You can see some change in https://github.com/tikv/tikv/pull/8989/files. But not all of it are suitable. I think server.rs shall never hold an object named GcWorker. It is only hold an object Arc<GcManager> and most interface of GcWorker will move to GcManager.

Little-Wallace commented 3 years ago

At first, you can keep GcManager and GcWorker but just cancel the thread in GcManager. You can see that the Worker has supplied a method named on_timeout which means that you can do something in period. So we can just set a period task, every time refresh the safe point just like what TiKV currently does in GcManager::wait_for_next_safe_point.

Little-Wallace commented 3 years ago

Then we can move the method gc_a_round to GcRunner and just replace GcManager::sync_gc with the method GcRunner::gc

Xuanwo commented 3 years ago

Progress Report at 2021-08-15

Hi, I'm busy with my work last two weeks. I'm reading the logic about GcManager and GcWorker for now.

In the next two weeks, I will try to cancel the thread in GcManager first.

Xuanwo commented 3 years ago

At first, you can keep GcManager and GcWorker but just cancel the thread in GcManager. You can see that the Worker has supplied a method named on_timeout which means that you can do something in period. So we can just set a period task, every time refresh the safe point just like what TiKV currently does in GcManager::wait_for_next_safe_point.

I tried following methods:

Please guide me the correct way before I going further. :smile: