zkfan / tungsten-replicator

Automatically exported from code.google.com/p/tungsten-replicator

Enable backwards propagation of purge points #148

Closed: GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
1. To which tool/application/daemon will this feature apply?

Tungsten Replicator. 

2. Describe the feature in general

Currently the THL can drop log files that have exceeded their retention period 
but are still needed by applier threads.  At the very least this forces 
regeneration of the THL; in the worst case it creates an unrecoverable failure 
that forces the slave to be re-provisioned. 

Tungsten will be upgraded to propagate the current purge point backwards 
through the pipeline, with the following additions (a sketch of the purge-point 
check appears after this list): 

a. The THL will not delete files that are still needed by appliers during 
normal operation. 

b. It should not be possible to cause the THL to delete needed files through 
any combination of database or replicator crashes and restarts. 

c. Backwards propagation will apply as needed to the master binlog position for 
downloading binlog events.  The THL position is already used to control purging 
of downloaded relay logs, so no additional development is required when 
downloading relay logs to temporary disk files. 

(This feature should also be tied into slave tracking so that we do not delete 
log files needed by currently active slaves.)
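
To make items (a) and (b) concrete, here is a minimal sketch of what a 
purge-point check might look like.  Every name in it (LogRetentionManager, 
PositionSource, getLastCommittedSeqno, and so on) is a hypothetical 
illustration, not the actual Tungsten Replicator API:

    import java.util.List;

    // Hypothetical sketch of backwards purge-point propagation.  Each
    // downstream stage (applier thread, etc.) reports its restart position;
    // the log may only purge files wholly below the minimum of those.
    public class LogRetentionManager {

        /** Restart position reported by one downstream consumer. */
        public interface PositionSource {
            long getLastCommittedSeqno();
        }

        private final List<PositionSource> downstreamStages;

        public LogRetentionManager(List<PositionSource> downstreamStages) {
            this.downstreamStages = downstreamStages;
        }

        /**
         * The purge point is the lowest sequence number still needed by any
         * downstream stage.  Propagated backwards, it also bounds the master
         * binlog position used for downloading binlog events (item c).
         */
        public long computePurgePoint() {
            long purgePoint = Long.MAX_VALUE;
            for (PositionSource stage : downstreamStages) {
                purgePoint = Math.min(purgePoint,
                        stage.getLastCommittedSeqno());
            }
            return purgePoint;
        }

        /**
         * A file is purgeable only when its retention has expired AND every
         * event in it precedes the purge point, so crashes or restarts that
         * delay a stage simply move the purge point back (item b).
         */
        public boolean isPurgeable(long fileMaxSeqno, boolean retentionExpired) {
            return retentionExpired && fileMaxSeqno < computePurgePoint();
        }
    }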

3. Describe the feature interface

To be completed as part of design.  

4. Give an idea (if applicable) of a possible implementation

To be completed as part of design. 

5. Describe pros and cons of this feature.

5a. Why the world will be a better place with this feature.

Premature deletion of needed log files is a persistent problem for slaves.  
This feature solves it. 

5b. What hardship will the human race have to endure if this feature is
implemented?

None, other than the trouble of implementing it. 

6. Notes

Original issue reported on code.google.com by berkeley...@gmail.com on 3 Jul 2011 at 11:05

GoogleCodeExporter commented 9 years ago
Issue 200 has been merged into this issue.

Original comment by g.maxia on 23 Aug 2011 at 2:06

GoogleCodeExporter commented 9 years ago
This is fixed.  The final step was to add flow control to relay logs using a 
blocking queue.  This prevents the relay log client from aging out files before 
the extractor can read them.  
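
As a rough illustration of the flow-control idea, a bounded BlockingQueue 
between the relay log writer and the extractor is enough to keep the writer 
from running arbitrarily far ahead.  The class and method names below are 
hypothetical; this is a sketch of the technique, not the actual implementation:

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Sketch of flow control between the relay log client and the extractor.
    public class RelayLogFlowControl {
        // Bounded queue of completed relay log file names.  put() blocks when
        // the queue is full, so the downloader cannot age out files the
        // extractor has not read yet.
        private final BlockingQueue<String> completedFiles =
            new ArrayBlockingQueue<>(25);

        // Called by the relay log client when it finishes writing a file;
        // blocks if the extractor has fallen behind.
        public void announceFile(String fileName) throws InterruptedException {
            completedFiles.put(fileName);
        }

        // Called by the extractor; once a file is taken here, it becomes a
        // candidate for purging later.
        public String nextFile() throws InterruptedException {
            return completedFiles.take();
        }
    }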

Meanwhile, it should not be possible under any circumstances to delete files in 
the THL before they are applied to the DBMS.  So far this guarantee holds 
within a single replicator only.  You can verify that the implementation works 
as follows: 

1.) Set up a master/slave topology.  On the slave, set the log retention to 1 
minute and make the THL log file size very small, say 1M bytes (see the 
example settings after step 3). 

2.) Set the slave offline and put 30 minutes' worth of load on the master.  

3.) Bring the slave online again.  THL files should only be dropped on the 
slave after they have been applied.  
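
For example, the slave-side settings and commands for these steps could look 
something like the following.  The property names 
(replicator.store.thl.log_file_retention, replicator.store.thl.log_file_size) 
and the exact values vary by version, so treat them as assumptions to check 
against your installation:

    # In the slave replicator's static properties file (assumed names/values)
    replicator.store.thl.log_file_retention=1m
    replicator.store.thl.log_file_size=1000000

    # Step 2: take the slave offline, then run ~30 minutes of load on the master
    trepctl offline

    # Step 3: bring the slave back online; watch the THL directory and confirm
    # files disappear only after their events have been applied
    trepctl online
    thl info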

Extending backwards propagation of the slave position to the master, so that 
the master does not delete files required by slaves, will be addressed in a 
future update and should be tracked as a separate issue.  

Original comment by robert.h...@continuent.com on 28 Nov 2011 at 6:54