allenbo / TwoPC

A practice implementation of the two-phase commit (2PC) protocol
GNU General Public License v2.0

Now it's time to add some logging and fault tolerance. #2

Open allenbo opened 9 years ago

allenbo commented 9 years ago

Currently there is some logging and fault tolerance in the code. Some suggestions:
1) Avoid coordinator crashes in the first place. It is easy to get started if we disregard that annoying situation for now.
2) A heartbeat signal from the coordinator?
3) Spin-wait for participants to recover.
4) A universal log for participants, or ignore logging in twopc entirely and let the application do the job instead?
5) Asynchronous communication in twopc seems necessary now.
6) Snapshots, or maybe let the app do that job too?
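To make suggestion 2 concrete, here is a minimal sketch of timeout-based heartbeat tracking. It is not code from this repo: `HeartbeatMonitor`, `beat`, `is_alive`, and the `timeout_s` parameter are hypothetical names, and the clock is injected so the logic can be checked without real waiting.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat seen from a peer (hypothetical helper)."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s      # silence longer than this means "suspected dead"
        self.clock = clock              # injectable clock, assumed monotonic
        self.last_seen = clock()        # treat construction time as the first beat

    def beat(self):
        """Record that a heartbeat message just arrived."""
        self.last_seen = self.clock()

    def is_alive(self):
        """True while the peer has been heard from within the timeout."""
        return (self.clock() - self.last_seen) <= self.timeout_s
```

A participant could call `beat()` whenever a coordinator heartbeat arrives and poll `is_alive()` before blocking on a decision. Note that a timeout only suspects a crash; it cannot tell a dead coordinator from a slow network.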

allenbo commented 9 years ago

Break this list down into several issues. First, tolerate participant crashes (the same as a network partition, hopefully?). A participant recovers from its log and snapshot (let the application do the logging here), then connects to the coordinator again. The coordinator just spin-waits for the connection if the connection was closed or a timeout happened (the coordinator has to shut down the connection in that case).
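The recovery flow described here could look roughly like the sketch below. Everything in it is an assumption for illustration: the log record format (`STATE txid` per line), the state names, and the helpers `recover_states`, `in_doubt`, and `wait_for_participant` do not come from this repo, and the thread leaves open whether the log lives in twopc or in the application.

```python
import time


def recover_states(log_lines):
    """Replay a participant log; the last record per transaction wins."""
    states = {}
    for line in log_lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or torn records, e.g. from a crash mid-write
        state, txid = parts
        states[txid] = state
    return states


def in_doubt(states):
    """Transactions that voted yes but never logged a decision."""
    return sorted(t for t, s in states.items() if s == "PREPARED")


def wait_for_participant(try_connect, attempts, delay_s=0.0, sleep=time.sleep):
    """Coordinator side: retry until the participant reconnects or attempts run out.

    try_connect is a caller-supplied function returning a connection or None.
    """
    for _ in range(attempts):
        conn = try_connect()
        if conn is not None:
            return conn
        sleep(delay_s)
    return None
```

After a restart, the participant would replay its log, then ask the coordinator for the outcome of every transaction returned by `in_doubt`. Bounding the retries with `attempts` is a design choice of this sketch; a literal spin-wait would loop until the connection comes back.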