is00hcw / tungsten-replicator

Automatically exported from code.google.com/p/tungsten-replicator

trepctl status can generate a NullPointerException when called #1073

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?

This is a follow-up of Issue 1053, which was not completely resolved. Running 
trepctl status intermittently results in either a NullPointerException, if the 
connection cannot be allocated, *or* the following message in the log: 

WARN  extractor.mysql.MySQLExtractor Unable to run SHOW MASTER STATUS to find 
log position; this can occur when service is going on/offline
INFO   | jvm 1    | 2014/12/05 04:52:06 | java.lang.NullPointerException

What is the expected output?

Trepctl status should either print NONE if the connection cannot be safely 
allocated or ERROR if there is a SQL error during execution of the command to 
check the master binlog position. 
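
The expected behavior above can be sketched as follows. This is only an illustration, not the replicator's code: the class MasterPositionStatus, the interface PositionQuery, and the method describe are hypothetical names, and the boolean argument stands in for "a connection was safely allocated".

```java
import java.sql.SQLException;

// Hypothetical helper; not part of the Tungsten Replicator code base.
final class MasterPositionStatus {

    // Assumed callback that looks up the master binlog position.
    interface PositionQuery {
        String run() throws SQLException;
    }

    // Returns NONE when no connection could be allocated, ERROR when the
    // position query fails with a SQL error, and the position otherwise.
    static String describe(boolean connectionAllocated, PositionQuery query) {
        if (!connectionAllocated) {
            return "NONE";
        }
        try {
            return query.run();
        } catch (SQLException e) {
            return "ERROR";
        }
    }
}
```

With this shape, neither failure path can surface as a NullPointerException to the caller of the status command.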

What do you see instead?

The above-described error. 
...

What is the possible cause?

This problem is due to a race condition when we connect to the DBMS while the 
replicator is going offline. The cause is simple. Pipelines have a data source 
service, which provides access to the DBMS. Normally this service is the first 
component to be configured and prepared, and also the last component to be 
released. This means that any other pipeline component can safely connect to 
the DBMS and release connections, because the pipeline lifecycle guarantees 
that the data source service's life cycle brackets the life cycles of all 
other components.

Unfortunately, in the case of a status command, processing is out of band. This 
leads to cases where we ask for connections while the data source is shutting 
down. Depending on the timing, there are several ways to get 
NullPointerExceptions.

What is the proposed solution?

There is a reasonable extension that avoids a lot of coding but precludes NPEs.

1.) The SqlDataSource class will offer a special API to allocate an out-of-band 
connection that does not have to be returned to the data source to be released. 
Instead, the client is responsible for calling Connection.close().
2.) The SqlDataSource methods will be synchronized so that this API either 
returns a connection safely or returns null if it cannot allocate the 
connection.
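
A minimal sketch of the two points above, under stated assumptions: OutOfBandDataSource is a hypothetical stand-in for SqlDataSource, the method names prepare, release, and getUnmanagedConnection are invented for illustration, and the connection type is generic so the example does not need a JDBC driver. The actual change in the replicator may look different.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for SqlDataSource; names and structure are assumptions.
final class OutOfBandDataSource<C extends AutoCloseable> {
    private final Supplier<C> connectionFactory; // assumed: opens a new connection
    private boolean available = false;

    OutOfBandDataSource(Supplier<C> connectionFactory) {
        this.connectionFactory = connectionFactory;
    }

    // Lifecycle methods share the object's monitor with the allocation call,
    // so an allocation never observes a half-released data source.
    synchronized void prepare() {
        available = true;
    }

    synchronized void release() {
        available = false;
    }

    // Returns a client-managed connection, or null if one cannot be
    // allocated. The caller, not the data source, must call close().
    synchronized C getUnmanagedConnection() {
        if (!available) {
            return null;
        }
        try {
            return connectionFactory.get();
        } catch (RuntimeException e) {
            return null;
        }
    }
}
```

A status call would check the result for null before use and close the connection itself when done, instead of handing it back to the data source.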

Original issue reported on code.google.com by robert.h...@continuent.com on 11 Dec 2014 at 11:54

GoogleCodeExporter commented 9 years ago
This issue was updated by revision r2708.

Added new call to SqlDataSource to allocate client managed JDBC connection so 
that status calls on MySQL do not generate a NullPointerException due to race 
conditions on pipeline startup and shutdown. 

Original comment by robert.h...@continuent.com on 11 Dec 2014 at 11:58

GoogleCodeExporter commented 9 years ago
Fixed as proposed in description.  NPEs begone!

Original comment by robert.h...@continuent.com on 11 Dec 2014 at 11:58

GoogleCodeExporter commented 9 years ago
Can this be moved to QA status and who could test it with the least amount of 
effort?

Original comment by linas.vi...@continuent.com on 12 Dec 2014 at 7:26

GoogleCodeExporter commented 9 years ago

Original comment by g.maxia on 12 Dec 2014 at 7:27

GoogleCodeExporter commented 9 years ago
The issue is fixed. There is no easy way to reproduce it, because it is 
caused by a race condition. We can easily observe it when it happens, but 
triggering it is a matter of chance.
We can only say that it hasn't shown up in the last ~1000 runs of the tests 
where it was observed previously. 

Original comment by g.maxia on 19 Dec 2014 at 7:42

GoogleCodeExporter commented 9 years ago

Original comment by linas.vi...@continuent.com on 20 Jan 2015 at 9:39

GoogleCodeExporter commented 9 years ago

Original comment by linas.vi...@continuent.com on 20 Jan 2015 at 9:42