apache / incubator-pegasus

Apache Pegasus - A horizontally scalable, strongly consistent and high-performance key-value store
https://pegasus.apache.org/
Apache License 2.0
1.96k stars 310 forks source link

fix(manual_compact): fix replica lose manual compact finished status after replica migrate bug #1961

Open ninsmiracle opened 3 months ago

ninsmiracle commented 3 months ago

What problem does this PR solve?

1665

And cause the misstake operate of PR #1666, I closed it and forced pull it.
So I have to re pull in another pull request.

What is changed and how does it work?

I add another string in meta_store (meta_store is a column families which persist pegasus_last_manual_compact_finish_time and so on). So that we can read the value from meta_store when we recover a replica after replica migrate.

Checklist

Tests

Test Result

Pegasus-AdminCli-1.2.0 » manual-compaction query -a gns
Table gns manual compaction progress is 100
Pegasus-AdminCli-1.2.0 » manual-compaction query -a gns
Table gns manual compaction progress is 83
Pegasus-AdminCli-1.2.0 » manual-compaction query -a gns
Table gns manual compaction progress is 83
Pegasus-AdminCli-1.2.0 » manual-compaction query -a gns
Table gns manual compaction progress is 100
Code changes
acelyc111 commented 3 months ago

It's OK to me to remove the periodic manual compaction function, it can be replaced by thirdparty tools, feel free to remove it.

ninsmiracle commented 3 months ago

It's OK to me to remove the periodic manual compaction function, it can be replaced by thirdparty tools, feel free to remove it.

I think it's reasonable to keep this periodic manual compaction function. Cause many user using spark to set manual compaction time now. If we remove this function in new version, we have to change the change user habits. Actully, I think we lack a unified management platform to control every things.

acelyc111 commented 2 months ago

It's OK to me to remove the periodic manual compaction function, it can be replaced by thirdparty tools, feel free to remove it.

I think it's reasonable to keep this periodic manual compaction function. Cause many user using spark to set manual compaction time now. If we remove this function in new version, we have to change the change user habits. Actully, I think we lack a unified management platform to control every things.

What is "using spark to set manual compaction time" ? Is it a Pegasus binding method? I don't think it's a block to remove the periodic manual compaction. This is not the core tasks of Pegasus, it make the code a bit of smelly.

At least, you need to make this patch to resolve these issues mentioned above.

ninsmiracle commented 2 months ago

It's OK to me to remove the periodic manual compaction function, it can be replaced by thirdparty tools, feel free to remove it.

I think it's reasonable to keep this periodic manual compaction function. Cause many user using spark to set manual compaction time now. If we remove this function in new version, we have to change the change user habits. Actully, I think we lack a unified management platform to control every things.

What is "using spark to set manual compaction time" ? Is it a Pegasus binding method? I don't think it's a block to remove the periodic manual compaction. This is not the core tasks of Pegasus, it make the code a bit of smelly.

At least, you need to make this patch to resolve these issues mentioned above.

In high version of PEGASUS-SPARK , a compaction request will be send to cluster after user call startBulkLoad . And I will update PEGASUS-SPARK this week. I do not think we should remove perodic manual compaction code now . Cause I think all its bugs about manual compaction have been fixed, and they are all minor issues.

acelyc111 commented 2 months ago

It's OK to me to remove the periodic manual compaction function, it can be replaced by thirdparty tools, feel free to remove it.

I think it's reasonable to keep this periodic manual compaction function. Cause many user using spark to set manual compaction time now. If we remove this function in new version, we have to change the change user habits. Actully, I think we lack a unified management platform to control every things.

What is "using spark to set manual compaction time" ? Is it a Pegasus binding method? I don't think it's a block to remove the periodic manual compaction. This is not the core tasks of Pegasus, it make the code a bit of smelly. At least, you need to make this patch to resolve these issues mentioned above.

In high version of PEGASUS-SPARK , a compaction request will be send to cluster after user call startBulkLoad . And I will update PEGASUS-SPARK this week. I do not think we should remove perodic manual compaction code now . Cause I think all its bugs about manual compaction have been fixed, and they are all minor issues.

Thanks to update the pegasus-spark project! It's not well maintenance for about 4 years.

Do you think it's confusing that the manual compact status depends on last_manual_compact_used_time? Is it possible manual_compact_last_time_used_ms == 0 but manual_compact_last_finish_time_ms > 0 ?

If the paramaters are set in a previous round, and havn't complete compaction and set the vaiables in current round before restarting the server, will it be considered as finished?

Could you add some tests to verify the bug has been fixed?

ninsmiracle commented 2 months ago

It's OK to me to remove the periodic manual compaction function, it can be replaced by thirdparty tools, feel free to remove it.

I think it's reasonable to keep this periodic manual compaction function. Cause many user using spark to set manual compaction time now. If we remove this function in new version, we have to change the change user habits. Actully, I think we lack a unified management platform to control every things.

What is "using spark to set manual compaction time" ? Is it a Pegasus binding method? I don't think it's a block to remove the periodic manual compaction. This is not the core tasks of Pegasus, it make the code a bit of smelly. At least, you need to make this patch to resolve these issues mentioned above.

In high version of PEGASUS-SPARK , a compaction request will be send to cluster after user call startBulkLoad . And I will update PEGASUS-SPARK this week. I do not think we should remove perodic manual compaction code now . Cause I think all its bugs about manual compaction have been fixed, and they are all minor issues.

Thanks to update the pegasus-spark project! It's not well maintenance for about 4 years.

Do you think it's confusing that the manual compact status depends on last_manual_compact_used_time? Is it possible manual_compact_last_time_used_ms == 0 but manual_compact_last_finish_time_ms > 0 ?

If the paramaters are set in a previous round, and havn't complete compaction and set the vaiables in current round before restarting the server, will it be considered as finished?

Could you add some tests to verify the bug has been fixed?

Firstly , I update the newest version of PEGASUS-SPARK and commit a merge request. And I think manual_compact_last_time_used_ms == 0 but manual_compact_last_finish_time_ms > 0 will happened before this pr . However , now , in pegasus_server_impl::start replica will reload the parameter from disk . If last_finish_time_ms be set , last_time_used_ms will be set also.The results of the staging environment test verified this point. In this week we will try to put this manual compaction fix version to online environment , and we will verify the effectiveness of bug fixes in large data environments .