datatechnology / cornerstone

C++ implementation of raft consensus
Apache License 2.0
264 stars 60 forks source link

A few questions about source code #64

Open iTomGeller opened 1 week ago

iTomGeller commented 1 week ago

Hi, I have a few questions to ask about the source code that are out of my reach.

  1. The lock use (1) In async.hxx, we immediately release the lock after we set the result, but in later if-block we call handler to process current result and err which may not be the result and err that we passed in originally. image (2) In raft_server.cxx, we acquire the recur_lock first. But this function is clearly not recursive, why do we need a recur_lock to exclude other threads? And why do we allow the only accepted thread to call this function more than once?(This recur_lock usage also appears in a lot of other functions like process_req in raft_server_req_handlers.cxx and handle_election_timeout in raft_server.cxx) image
  2. The election_timer In raft_server_req_handlers.cxx, we restart the election_timer when the server receives an illegal snapshot-req from peer of the same term. image Why can't we just ignore the illegal request and continue, what is the purpose of restarting election_timer here?

3.The shared_ptr usage As said in cppreference,

Constructing a std::shared_ptr for an object that is already managed by another std::shared_ptr will not consult weak_this and thus will lead to undefined behavior.

However in cornerstone, it is frequently seen that ptr<T>(cs_new<T>(...)) is used which pass another shared_ptr instance as the parameter of the ptr, violating the rule and thus may lead to undefined behaviour. image image

I would appreciate it if anyone could help me out here!

andy-yx-chen commented 6 days ago

for (1), set_result is not expected to be call multiple times, you can add an assert after lock assert(!hasresult); for (2), that's just because the lock type is recursive_mutex, you can change the lock type to mutex but need a bit changes. for (3), this is not an example for c++ programming language class

iTomGeller commented 5 days ago

1.

for (3), this is not an example for c++ programming language class

I don't quite understand what does this mean.The ptr is short for shared_ptr in cornerstone, and the cs_new aims to create a shared_ptr instance which overall adds up to the circumstance described in the cppreference notes and should avoid using ptr<T> some_object(cs_new<T>(xxx)) to guarantee memory safety.

2. The election timer issue is still unsolved.Why should we have to restart the election_timer when we receive illegal install-snapshot-req from followers of the same term rather than just ignore it?Could you please help solve it?@andy-yx-chen

andy-yx-chen commented 5 days ago

for (3), this is not an example for c++ programming language class

I don't quite understand what does this mean.The ptr is short for shared_ptr in cornerstone, and the cs_new aims to create a shared_ptr instance which overall adds up to the circumstance described in the cppreference notes and should avoid using ptr<T> some_object(cs_new<T>(xxx)) to guarantee memory safety.

The election timer issue is still unsolved.Why should we have to restart the election_timer when we receive illegal install-snapshot-req from followers of the same term rather than just ignore it?Could you please help solve it?@andy-yx-chen

for 1, I am pretty sure you do not get what the document says in the link you posted for 2, because it's a follower

iTomGeller commented 3 days ago

for (3), this is not an example for c++ programming language class

I don't quite understand what does this mean.The ptr is short for shared_ptr in cornerstone, and the cs_new aims to create a shared_ptr instance which overall adds up to the circumstance described in the cppreference notes and should avoid using ptr<T> some_object(cs_new<T>(xxx)) to guarantee memory safety. The election timer issue is still unsolved.Why should we have to restart the election_timer when we receive illegal install-snapshot-req from followers of the same term rather than just ignore it?Could you please help solve it?@andy-yx-chen

for 1, I am pretty sure you do not get what the document says in the link you posted for 2, because it's a follower 1. After a test on the shared_ptr constructor,I see that ptr<T> some_object(cs_new<T>(xxx)) works.


std::shared_ptr<MyClass> myPtr = std::make_shared<MyClass>();
std::cout << "Reference count after creation: " << myPtr.use_count() << std::endl;
std::shared_ptr<MyClass> anotherPtr = myPtr;
std::cout << "Reference count after copy: " << myPtr.use_count() << std::endl;

the copy-construction using `std::make_shared` is ok and the count is correct 1, though I still don't know why.
2.
It is in the body of `handle-install-snapshot-req`, and clearly is not in the election process.
Would it be different if the role is follower and if so,in which step of the snapshot-install process have we need to consider the election?  
@andy-yx-chen