Changes proposed in this pull request
Cleanup to use a common SnapshotConnectionListener for handling snapshot transactions
use a single SnapshotConnectionListener for every TaskContext (maintained in a map) and thus have a single connection shared among all the plans in a single Task
removed all other custom TaskCompletionListeners that handled connection close/commit, etc.
removed all code to explicitly start snapshot transactions, obtain the TXID, transfer it to connections, etc., since everything is now handled by the common SnapshotConnectionListener with its single shared connection
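The per-task sharing described above can be sketched roughly as follows. This is a minimal illustration of the pattern, not the actual SnappyData code: `TaskListenerRegistry`, `SnapshotListener`, and the string connection id are hypothetical stand-ins, assuming the listener is created lazily per task id and torn down from a completion callback.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the listener that owns the shared connection.
final class SnapshotListener {
    final String connectionId; // stand-in for the shared JDBC connection
    SnapshotListener(String connectionId) { this.connectionId = connectionId; }
}

final class TaskListenerRegistry {
    // keyed by task id, mirroring the per-TaskContext map in the change
    private static final Map<Long, SnapshotListener> listeners = new ConcurrentHashMap<>();

    // every plan in the task asks here; only the first call creates the
    // listener, so all plans in one task share the same connection
    static SnapshotListener getOrCreate(long taskId) {
        return listeners.computeIfAbsent(taskId,
                id -> new SnapshotListener("conn-for-task-" + id));
    }

    // invoked from a task-completion callback: commit/close once, then forget
    static void onTaskCompletion(long taskId) {
        listeners.remove(taskId);
    }
}
```

The `computeIfAbsent` call is what makes the first plan create the listener and every later plan in the same task reuse it.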
maintain separate snapshot-enabled and snapshot-disabled pools; the latter can temporarily have snapshot transactions running on it (when a row table is joined with a column table, for example), in which case the isolation level is explicitly reverted at the end
use a temporary TaskContext for cases where one is not available in ColumnBatchIterator, so that the common SnapshotConnectionListener can be used consistently
added metrics for the number of column batches compacted and rolled over in update/delete plans; these are driven by the before-commit results in SnapshotConnectionListener
added more metrics, such as the number of deleted/updated batches, to the delete/update plans
removed custom transaction handling in ColumnInsertExec; everything is now taken care of by SnapshotConnectionListener, as for other plans
added a test that does concurrent updates/deletes/selects, which leads to continuous compactions and rollovers (code for the former will be added in following checkins); fixed multiple causes of failures in this new test (the common SnapshotConnectionListener itself being the major fix)
updated tests for the above changes
updated tomcat-jdbc to the latest version, 10.0.10
force-start the transaction immediately (instead of on the first TX-related operation) in SnapshotConnectionListener, since the thread-local TX is used by PRValuesIterator in its constructor to create transactional local/remote iterators
added a ClusteredColumnIterator.fillColumnValues() method to pre-fill all the projected columns instead of doing so on the first getColumnValue() call; the method returns a boolean indicating whether all column values were found; ColumnTableScan and its caller in StoreCallbacksImpl now use it to skip a batch if any of the column values are missing (due to concurrent deletes/updates) instead of throwing an EntryNotFoundException later
added cleanup of any remaining statsRows for the case of a premature RemoteEntriesIterator.close()
made logging methods in Logging public rather than protected
Batch compaction
added batch compaction, performed as a before-commit action when a batch has a large number of deletes/updates
the new Spark-level property snappydata.column.compactionRatio (default 0.1, i.e. 10%) determines the point at which compaction is triggered
results of the compaction are captured by SnapshotConnectionListener, and the compaction metric of the update/delete operation is incremented if set
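The trigger condition implied by the ratio property can be sketched as below. This is only an illustration: `shouldCompact` is a hypothetical helper, the property lookup itself is elided, and 0.1 is the documented default.

```java
final class CompactionCheck {
    // returns true when the fraction of deleted/updated rows in a batch
    // reaches the configured snappydata.column.compactionRatio
    static boolean shouldCompact(int changedRows, int totalRows, double compactionRatio) {
        if (totalRows <= 0) return false; // empty batch: nothing to compact
        return (double) changedRows / totalRows >= compactionRatio;
    }
}
```

With the default ratio of 0.1, a batch with 5 changed rows out of 100 is left alone, while one with 10 or more changed rows out of 100 is compacted before commit.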
split out the code that uses the thread-local connection from StoreCallbacksImpl into Utils, which is now used by the compactor too
if SnapshotConnectionListener picks up an EmbedConnection from the current context, then it does not close or commit/rollback it, since that will be taken care of by the origin of the operation (which should be a remote node)
removed the fillColumnLobs/fillColumnValues methods previously added in the iterator, and their calls in ColumnTableScan, since it is safer to throw an EntryDestroyException and let the task fail (and be retried) than to silently skip the batch
removed the TaskCompletionListener that closed encoders in ColumnInsertExec, since it holds references to the created batches, which can build up if a large number of compactions occur in a single Task; the cleanup has moved to a finally block in the generated code
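The shape of that last change can be illustrated like this. `Encoder` and `insertOneBatch` are hypothetical stand-ins (the real code is Java source generated by ColumnInsertExec); the point is only that the encoder is released in a finally block inside the insert body itself, rather than by a completion listener that would pin it until the end of the task.

```java
import java.util.ArrayList;
import java.util.List;

final class InsertSketch {
    // stand-in for the real column encoder; records whether close() ran
    static final class Encoder {
        boolean closed = false;
        void close() { closed = true; }
    }

    static final List<Encoder> created = new ArrayList<>();

    // shape of the generated insert body after the change: the encoder is
    // closed in a finally block inside the task itself, so it never outlives
    // the batch it wrote, even when many batches are created in one Task
    static void insertOneBatch() {
        Encoder encoder = new Encoder();
        created.add(encoder);
        try {
            // ... encode rows and hand off the finished column batch ...
        } finally {
            encoder.close();
        }
    }
}
```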
Patch testing
precheckin -Pstore; transactional hydra tests
ReleaseNotes.txt changes
Release 1.3.0: added column compaction when the number of updates/deletes exceeds the limit specified by the Spark property snappydata.column.compactionRatio (default 0.1)
Other PRs
https://github.com/TIBCOSoftware/snappy-store/pull/569